Residential Zoning in Pittsburgh

The New York Times recently published an article about zoning in U.S. cities, particularly single-unit detached residential housing. The article did not include Pittsburgh, so I downloaded the zone shapefile from the WPRDC and made my own map.

This blog quickly goes through the steps to make the map and other graphs about the data.

First, load the required libraries and set up the environment:

library(tidyverse)
library(sf)
library(ggmap)
library(janitor)
library(hrbrthemes)

options(scipen = 999)

Read in the shapefile with st_read and inspect the data with glimpse:

shapefile <- st_read("data/Zoning-shp/Zoning.shp")
## Reading layer `Zoning' from data source `/Users/conortompkins/github_repos/blog_hugo_academic/content/post/residential-zoning-in-pittsburgh/data/Zoning-shp/Zoning.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1054 features and 18 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -80.09534 ymin: 40.36161 xmax: -79.86577 ymax: 40.50097
## geographic CRS: WGS 84
glimpse(shapefile)
## Rows: 1,054
## Columns: 19
## $ objectid   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
## $ area       <dbl> 404170.093, 332097.787, 192093.378, 22816.707, 962349.849,…
## $ perimeter  <dbl> 4020.2318, 3522.5948, 2748.8339, 715.3339, 4499.1899, 2498…
## $ zoning_    <int> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19…
## $ zoning_id  <int> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19…
## $ zon_new    <chr> "P", "P", "LNC", "LNC", "P", "HC", "LNC", "R1D-M", "R1D-M"…
## $ shape_leng <dbl> 4020.2318, 3522.5948, 2748.8339, 715.3339, 4499.1899, 2498…
## $ correction <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ full_zonin <chr> "PARKS AND OPEN SPACE", "PARKS AND OPEN SPACE", "LOCAL NEI…
## $ legendtype <chr> "Parks", "Parks", "Local Neighborhood Commercial", "Local …
## $ municode   <chr> "http://library.municode.com/HTML/13525/level4/PIZOCO_TITN…
## $ status     <chr> "Approved", "Approved", "Approved", "Approved", "Approved"…
## $ created_us <chr> "pgh.admin", "pgh.admin", "pgh.admin", "pgh.admin", "pgh.a…
## $ created_da <date> 2020-06-02, 2020-06-02, 2020-06-02, 2020-06-02, 2020-06-0…
## $ last_edite <chr> "pgh.admin", "pgh.admin", "pgh.admin", "pgh.admin", "pgh.a…
## $ last_edi_1 <date> 2020-06-02, 2020-06-02, 2020-06-02, 2020-06-02, 2020-06-0…
## $ Shape__Are <dbl> 404170.093, 332097.787, 192093.378, 22816.707, 962349.849,…
## $ Shape__Len <dbl> 4020.2318, 3522.5948, 2748.8339, 715.3339, 4499.1899, 2498…
## $ geometry   <POLYGON [°]> POLYGON ((-79.9068 40.36172..., POLYGON ((-79.9013…

We need to munge the data to get it in shape for analysis. This makes some simple TRUE|FALSE flags for basic zone information and uses case_when to create type, which represents aggregated zone types.

df <- shapefile %>% 
  mutate(residential = str_detect(full_zonin, "RESIDENT"),
         single_unit = str_detect(full_zonin, "SINGLE-UNIT"),
         attached = str_detect(full_zonin, "ATTACHED"),
         type = case_when(residential == TRUE & single_unit == TRUE & attached == FALSE ~ "Single-unit detached residential",
                          residential == TRUE & single_unit == FALSE | attached == TRUE ~ "Other residential",
                          full_zonin == "EDUCATIONAL/MEDICAL INSTITUTION" ~ "Educational/Medical",
                          residential == FALSE ~ "Other non-residential"),
         type = factor(type, levels = c("Single-unit detached residential", 
                                        "Other residential",
                                        "Educational/Medical",
                                        "Other non-residential")),
         alpha_flag = type == "Single-unit detached residential")
## Rows: 1,054
## Columns: 23
## $ objectid    <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17…
## $ area        <dbl> 404170.093, 332097.787, 192093.378, 22816.707, 962349.849…
## $ perimeter   <dbl> 4020.2318, 3522.5948, 2748.8339, 715.3339, 4499.1899, 249…
## $ zoning_     <int> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 1…
## $ zoning_id   <int> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 1…
## $ zon_new     <chr> "P", "P", "LNC", "LNC", "P", "HC", "LNC", "R1D-M", "R1D-M…
## $ shape_leng  <dbl> 4020.2318, 3522.5948, 2748.8339, 715.3339, 4499.1899, 249…
## $ correction  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ full_zonin  <chr> "PARKS AND OPEN SPACE", "PARKS AND OPEN SPACE", "LOCAL NE…
## $ legendtype  <chr> "Parks", "Parks", "Local Neighborhood Commercial", "Local…
## $ municode    <chr> "http://library.municode.com/HTML/13525/level4/PIZOCO_TIT…
## $ status      <chr> "Approved", "Approved", "Approved", "Approved", "Approved…
## $ created_us  <chr> "pgh.admin", "pgh.admin", "pgh.admin", "pgh.admin", "pgh.…
## $ created_da  <date> 2020-06-02, 2020-06-02, 2020-06-02, 2020-06-02, 2020-06-…
## $ last_edite  <chr> "pgh.admin", "pgh.admin", "pgh.admin", "pgh.admin", "pgh.…
## $ last_edi_1  <date> 2020-06-02, 2020-06-02, 2020-06-02, 2020-06-02, 2020-06-…
## $ Shape__Are  <dbl> 404170.093, 332097.787, 192093.378, 22816.707, 962349.849…
## $ Shape__Len  <dbl> 4020.2318, 3522.5948, 2748.8339, 715.3339, 4499.1899, 249…
## $ residential <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TR…
## $ single_unit <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TR…
## $ attached    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
## $ type        <fct> Other non-residential, Other non-residential, Other non-r…
## $ alpha_flag  <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TR…

This counts the number of rows per full zone description (full_zonin) and type:

df_zones <- df %>% 
  count(full_zonin, type, sort = TRUE) %>% 
  st_drop_geometry()
## Rows: 59
## Columns: 3
## $ full_zonin <chr> "PARKS AND OPEN SPACE", "LOCAL NEIGHBORHOOD COMMERCIAL", "…
## $ type       <fct> Other non-residential, Other non-residential, Single-unit …
## $ n          <int> 149, 126, 75, 69, 65, 52, 51, 44, 43, 42, 41, 37, 36, 20, …

Create a basic bar chart to show the distribution of type:

df %>% 
  st_drop_geometry() %>% 
  group_by(type, residential) %>% 
  summarize(area = sum(area)) %>% 
  ungroup() %>% 
  mutate(type = fct_reorder(type, area)) %>% 
  ggplot(aes(type, area / 1000000, fill = residential)) +
    geom_col() +
    scale_y_comma() +
    scale_fill_discrete("Is the zone residential?") +
    labs(x = "Zone type",
         y = "Land area in millions of feet") +
    coord_flip() +
    theme_ipsum()

Use a bar chart to show the distribution of full_zonin:

df %>% 
  st_drop_geometry() %>% 
  group_by(full_zonin, residential) %>% 
  summarize(area = sum(area)) %>% 
  ungroup() %>% 
  mutate(full_zonin = fct_reorder(full_zonin, area)) %>% 
  ggplot(aes(full_zonin, area / 1000000, fill = residential)) +
    geom_col() +
    scale_y_comma() +
    scale_fill_discrete("Is the zone residential?") +
    labs(x = "Full zone description",
         y = "Land area in millions of feet") +
    coord_flip() +
    theme_ipsum()

This calculates the total land area zoned for any type of residential housing:

df %>% 
  st_drop_geometry() %>% 
  mutate(single_unit_flag = type == "Single-unit detached residential") %>% 
  filter(residential == TRUE) %>% 
  summarize(total_area = sum(area))
##   total_area
## 1  770835393

This calculates the % of residential zoning that is zoned for single-unit detached residential housing units:

df %>% 
  st_drop_geometry() %>% 
  filter(residential == TRUE) %>% 
  mutate(single_unit_flag = (type == "Single-unit detached residential")) %>% 
  group_by(single_unit_flag) %>% 
  summarize(zone_area = sum(area)) %>% 
  mutate(pct_area = zone_area / sum(zone_area))
## # A tibble: 2 x 3
##   single_unit_flag  zone_area pct_area
##   <lgl>                 <dbl>    <dbl>
## 1 FALSE            335471796.    0.435
## 2 TRUE             435363597.    0.565

This creates a map of the zones, fills them by type, and overlays it on a GoogleMaps basemap. I also insert the boundaries of the City of Pittsburgh.

city_boundary <- st_read("data/Pittsburgh_City_Boundary-shp/City_Boundary.shp")
## Reading layer `City_Boundary' from data source `/Users/conortompkins/github_repos/blog_hugo_academic/content/post/residential-zoning-in-pittsburgh/data/Pittsburgh_City_Boundary-shp/City_Boundary.shp' using driver `ESRI Shapefile'
## Simple feature collection with 8 features and 8 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -80.09534 ymin: 40.36161 xmax: -79.86577 ymax: 40.50097
## geographic CRS: WGS 84
#register_google(key = "your key here")
pgh_map <- get_map(location = c(lat = 40.445315, lon = -79.977104), 
                   zoom = 12, maptype = "roadmap", source = "google")


ggmap(pgh_map) +
  geom_sf(data = df %>% filter(type != "Other non-residential"), aes(fill = type), inherit.aes = FALSE, size = .5, alpha = 1, color = NA) +
  geom_sf(data = city_boundary, inherit.aes = FALSE, alpha = 0, size = 2) +
  coord_sf(crs = st_crs(4326)) +
  scale_fill_manual("Zone type",
                      values = c("#ea60b9", "#4cafc5", "yellow", "light grey")) +
  labs(title = "56% of residential zoned land area is single-family detached residential",
         subtitle = "City of Pittsburgh zoning",
         caption = "@conor_tompkins, data from WPRDC") +
  theme_void()

I used scale_fill_manual to manually set the color palette to match the NYTimes article.

Related