Census

Analyzing major commuter routes in Allegheny County

In this post I will use the Mapbox API to calculate metrics for major commuter routes in Allegheny County. The API will provide the distance and duration of each trip, as well as turn-by-turn directions.

Mapping BosWash commuter patterns with Flowmap.blue

This map shows the commuter patterns in the Northeast Megalopolis/Acela Corridor/BosWash metro area. I pulled the data from the Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) Origin-Destination Employment Statistics (LODES) dataset via the {lehdr} package. The map was created through the Flowmap.

How Many Pittsburghers Cross the River to Get to Work

This post focuses on how many rivers Pittsburghers cross to get to work. I use the U.S. Census Bureau LEHD Origin-Destination Employment Statistics (LODES) dataset to draw lines between “home” census tracts and “work” census tracts, and then count how many “commuter lines” intersect with the 3 main rivers in Pittsburgh.
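As a rough illustration of the counting step, here is a minimal sketch using the sf package. The coordinates, the two toy "rivers", and the three "commuter lines" are all made up for the example; they are not the LODES data or the real river geometries, and the post itself may structure this differently.

```r
library(sf)

# Hypothetical rivers: one running north-south, one running east-west
rivers <- st_sfc(
  st_linestring(rbind(c(1, -1), c(1, 3))),     # river 1 along x = 1
  st_linestring(rbind(c(-1, 1.5), c(3, 1.5)))  # river 2 along y = 1.5
)

# Hypothetical commuter lines from a "home" point to a "work" point
commutes <- st_sfc(
  st_linestring(rbind(c(0, 0), c(2, 0))),    # crosses river 1 only
  st_linestring(rbind(c(0, 0), c(0.5, 1))),  # crosses neither river
  st_linestring(rbind(c(0, 0), c(2, 2)))     # crosses both rivers
)

# st_intersects() returns, for each commuter line, the indices of the
# rivers it intersects; lengths() turns that into a count per line
rivers_crossed <- lengths(st_intersects(commutes, rivers))

# Number of commuter lines crossing 0, 1, or 2 rivers
table(rivers_crossed)
```

With real data, each commuter line would also carry a commuter count, so the final tally would likely be a weighted sum rather than a simple count of lines.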

Map Census Data With R

This talk was presented on May 30th, 2019 at Code For Pittsburgh. Before we dive in, this presentation assumes that the user has basic familiarity with tidyverse, mainly dplyr.

Clustering Allegheny County Census Tracts With PCA and k-means

In this post I will use the census API discussed in the last post to cluster the Allegheny County census tracts using PCA and k-means.

Setup:

    library(tidyverse)
    library(tidycensus)
    library(tigris)
    library(sf)
    library(broom)
    library(ggfortify)
    library(viridis)
    library(janitor)
    library(scales)
    library(ggthemes)

    options(tigris_use_cache = TRUE)
    theme_set(theme_minimal())

    census_vars <- load_variables(2010, "sf1", cache = TRUE)

Census tracts are small geographic areas analogous to local neighborhoods.
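The general PCA-then-k-means workflow can be sketched in base R as follows. This is only an illustration under stated assumptions: the `tracts` data frame is simulated, its column names are hypothetical stand-ins for census variables, and the choice of two components and three clusters is arbitrary rather than taken from the post.

```r
set.seed(42)

# Simulated stand-in for tract-level census variables (hypothetical columns)
n <- 60
tracts <- data.frame(
  pct_under_18   = runif(n, 5, 35),
  pct_over_65    = runif(n, 5, 35),
  pct_renter     = runif(n, 10, 90),
  median_hh_size = rnorm(n, 2.4, 0.4)
)

# PCA on centered, scaled variables so no single variable dominates
pca <- prcomp(tracts, center = TRUE, scale. = TRUE)

# Share of total variance captured by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Cluster the tracts on their first two principal component scores
scores <- pca$x[, 1:2]
km <- kmeans(scores, centers = 3, nstart = 25)

# Attach the cluster label to each tract and count tracts per cluster
tracts$cluster <- factor(km$cluster)
table(tracts$cluster)
```

Clustering on the leading components rather than the raw variables is one common design choice; in practice the number of clusters would be chosen by inspecting something like the within-cluster sum of squares across several values of `centers`.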

Exploring Allegheny County With Census Data

This post explores Allegheny County and Pennsylvania through census data. I use the tidycensus and sf packages to collect data from the census API and draw maps with the data.

Exploring 311 Data With PCA

Principal Component Analysis in R: Principal Component Analysis (PCA) is an unsupervised method that reduces the number of dimensions in a dataset and highlights where the data varies. We will use PCA to analyze the 311 dataset from the Western Pennsylvania Regional Data Center (WPRDC).