How Staten Island voted in 2016 (and what it might mean for 2018)

Feb 28 2018

In the last month, I’ve been volunteering with a new political advocacy organization, Changing the Conversation Together. The premise of CTC is that through long, in-person conversations, voters can be persuaded to change their minds about issues or elections. The method is called “Deep Canvassing” and was used most prominently to reduce predjudice against transgender people in California.

My contribution to CTC so far has been to gather and clean prior election results, combine them with census data, and make some maps. All of which are great tasks to work on using R!

During the 2018 election cycle, CTC is working in New York’s 11th Congressional District to flip its House seat from Republican to Democrat. NY 11 covers all of Staten Island and a small portion of south Brooklyn. It’s the only congressional district in New York City that is represented by a Republican (Dan Donovan). In 2016, the district voted for Trump over Clinton by a margin of 10 percentage points (54.4% to 44.4%).

In this post, I’ll download and map 2016 presidential election results at the smallest geographic unit available, the election district (ED). One use of a map like this is to help CTC decide which EDs to focus on with their deep canvassing work.

The NYC Board of Elections has certified results of all the 2016 elected offices available for download in CSV format on their website.

First I’ll pull the appropriate CSV into R and get it all tidied up:

# load packages all the packages i'll need
library(tidyverse)
library(janitor)
library(sf)
library(tmap)
library(leaflet)
library(leaflet.extras)

# 2016 presidential results for all of nyc
pres_2016 <- "http://vote.nyc.ny.us/downloads/csv/election_results/2016/20161108General%20Election/00000100000Citywide%20President%20Vice%20President%20Citywide%20EDLevel.csv"

# read in 2016 pres results
pres_2016_results <- read_csv(
  file = pres_2016,
  col_types = cols(.default = "c")
  ) %>%
  clean_names() %>%
  mutate(
    elect_dist = paste0(ad, ed), # combine elect and assemb dist into unique id
    tally = as.integer(tally), # convert vote tally to integer
    candidate = case_when(
      str_detect(unit_name, "Clinton") ~ "Clinton",
      str_detect(unit_name, "Stein") ~ "Stein",
      str_detect(unit_name, "Johnson") ~ "Johnson",
      str_detect(unit_name, "Trump") ~ "Trump"
    )
  ) %>%
  filter(!is.na(candidate)) %>%
  select(elect_dist, candidate, tally)

# tally and tidy 2016 pres results
pres_2016_tidy <- pres_2016_results %>%
  group_by(elect_dist) %>%
  # candidates may have vote tallies on multiple lines so we need to combine
  summarise(
    year = 2016L,
    type = "general",
    office = "pres",
    clinton = sum(tally[candidate == "Clinton"]),
    trump = sum(tally[candidate == "Trump"]),
    stein = sum(tally[candidate == "Stein"]),
    johnson = sum(tally[candidate == "Johnson"])
  ) %>%
  gather(candidate, tally, -(elect_dist:office)) %>%
  arrange(elect_dist)

pres_2016_tidy

## # A tibble: 21,384 x 6
##    elect_dist  year type    office candidate tally
##    <chr>      <int> <chr>   <chr>  <chr>     <int>
##  1 23001       2016 general pres   clinton     171
##  2 23001       2016 general pres   trump       572
##  3 23001       2016 general pres   stein         7
##  4 23001       2016 general pres   johnson      21
##  5 23002       2016 general pres   clinton     156
##  6 23002       2016 general pres   trump       573
##  7 23002       2016 general pres   stein        15
##  8 23002       2016 general pres   johnson      13
##  9 23003       2016 general pres   clinton     167
## 10 23003       2016 general pres   trump       559
## # … with 21,374 more rows

I’ve got every election district in New York City here but I want to limit my mapping to only EDs contained in the 11th Congressional District. The easiest way I’ve found to determine these EDs is by downloading the election results for the 2016 congressional election in NY 11. Then I can filter the presidential results to only those EDs.

# 2016 ny11 results
ny11_2016 <- "http://vote.nyc.ny.us/downloads/csv/election_results/2016/20161108General%20Election/00002000011Crossover%20Representative%20in%20Congress%2011th%20Congressional%20District%20EDLevel.csv"

# read in ny11 results and create unique vector of EDs in ny11
ny11_eds <- read_csv(
  file = ny11_2016,
  col_types = cols(.default = "c")
  ) %>%
  clean_names() %>%
  mutate(elect_dist = paste0(ad, ed)) %>%
  distinct(elect_dist) %>%
  pull()

# filter 2016 pres results to only ny116
ny11_pres_2016_tidy <- pres_2016_tidy %>%
  filter(elect_dist %in% ny11_eds)
ny11_pres_2016_tidy

## # A tibble: 1,716 x 6
##    elect_dist  year type    office candidate tally
##    <chr>      <int> <chr>   <chr>  <chr>     <int>
##  1 41080       2016 general pres   clinton      34
##  2 41080       2016 general pres   trump        33
##  3 41080       2016 general pres   stein         0
##  4 41080       2016 general pres   johnson       2
##  5 45007       2016 general pres   clinton      16
##  6 45007       2016 general pres   trump        20
##  7 45007       2016 general pres   stein         0
##  8 45007       2016 general pres   johnson       0
##  9 45008       2016 general pres   clinton     123
## 10 45008       2016 general pres   trump       263
## # … with 1,706 more rows

Now I’ve got my tidy target election districts and the vote tallies for each candidate. Before I put these results on a map, I’ll calculate vote percentages and margins that I want to include in my map. I also want to identify particular EDs where the margin between Trump and Clinton was pretty close. These more competitive EDs might be good to target for voter persuasion in the lead up to the 2018 election. To do that I need to reshape my data a bit and then join the vote tallies with a shapefile of ED boundaries.¹

# define function to convert to wide and calc vote proportion for each ed
widen_add_tot_prop <- function(x) {
  cand_totals <- x %>%
    count(elect_dist, candidate, wt = tally) %>%
    mutate(candidate = paste0(candidate, "_tot")) %>%
    spread(candidate, n)
  
  dist_totals <- x %>%
    count(elect_dist, wt = tally) %>%
    rename(total_votes = n) %>%
    left_join(cand_totals, by = "elect_dist")
  
  x %>%
    add_count(elect_dist, wt = tally) %>%
    mutate(prop = tally / n) %>%
    select(-tally, -n) %>%
    mutate(candidate = paste0(candidate, "_prop")) %>%
    spread(key = candidate, value = prop) %>%
    left_join(dist_totals, by = "elect_dist") %>%
    select(elect_dist, year, type, office, total_votes, everything())
}

# convert to wide, calculate dem/rep vote margin, select vars
ny11_margin <- ny11_pres_2016_tidy %>%
  widen_add_tot_prop() %>%
  mutate(d_r_margin = clinton_prop - trump_prop) %>%
  select(elect_dist, total_votes, d_r_margin,
         clinton_prop, trump_prop, johnson_prop, stein_prop,
         clinton_tot, trump_tot, johnson_tot, stein_tot)

# read in geojson of ny11 ed polygons
ny11_link <- "https://raw.githubusercontent.com/mfherman/mattherman/master/static/shp/ny11_ed.geojson"
ny11_sf <- read_sf(ny11_link)

# join vote tallies with sf geo obhect
ny11_pres_to_map <- inner_join(ny11_sf, ny11_margin) %>%
  mutate(elect_dist = paste("ED", elect_dist))

# pull out eds that leaned repub
ny11_pres_close_rep <- ny11_pres_to_map %>%
  filter(d_r_margin >= -0.1 & d_r_margin < 0)

# pull out eds that leaned dem
ny11_pres_close_dem <- ny11_pres_to_map %>%
  filter(d_r_margin >= 0 & d_r_margin <= 0.1)

There are a few packages you can use to create interactive Leaflet maps directly from R. Recently, I’ve been using tmap, but leaflet from RStudio and mapview are also good options.²

First, I’ll define a couple of helper function to format percentages and margins on my map. Next I’ll use tmap to define the features, fill colors, basemap, and popups of my choropleth map of 2016 presidential election results.

Then I’ll add orange borders to the EDs that were close but leaned Democratic and green borders to those that leaned Republican.³ Finally, I’ll add a cool bonus leaflet feature, a search box that identifies EDs.

# define a little helper function to format percentages for my map
make_pct <- function(x, digits = 1) {
  paste0(formatC(x * 100, digits = digits, format = "f"), "%")
}

# define a little helper function to format margins for my map
make_margin <- function(x, digits = 1) {
  if_else(x > 0,
          paste0("+",formatC(x * 100, digits = digits, format = "f")),
          formatC(x * 100, digits = digits, format = "f")
          )
}

# define tmap object, fill, popup vars, basemap
my_map <- tm_shape(ny11_pres_to_map, name = "2016 Presidential") +
  tm_fill(
    col = "d_r_margin", # select d/r margin column to define fill colors
    palette = "RdBu", # select red/blue diverging palette
    style = "cont", # continous fill rather than bins
    breaks = seq(-1, 1, by = 0.2),
    title = "D/R Margin",
    textNA = "No Votes",
    id = "elect_dist", # define popup label id
    popup.vars = c(
      "Dem/Rep Margin" = "d_r_margin", # name vars for popups
      "Total Votes" = "total_votes",
      "Clinton %" = "clinton_prop",
      "Trump %" = "trump_prop",
      "Johnson %" = "johnson_prop",
      "Stein %" = "stein_prop"
    ),
    popup.format = list(
      d_r_margin = list(fun = make_margin), # nice formatting for popup numbers
      total_votes = list(format = "f", digits = 0),
      clinton_prop = list(fun = make_pct),
      trump_prop = list(fun = make_pct),
      johnson_prop = list(fun = make_pct),
      stein_prop = list(fun = make_pct)
      )
  ) +
  tm_borders(col = "darkgray") +
  tm_shape(ny11_pres_close_dem, name = "Lean Dem") +
  tm_borders(col = "orange", lwd = 2) + # add lean dem borders
  tm_shape(ny11_pres_close_rep, name = "Lean Rep") + 
  tm_borders(col = "green", lwd = 2) + # add lean rep borders
  tm_view(
    alpha = 0.85, # a little less transparent than tmap default
    basemaps = "Stamen.TonerLite", # pick a pretty basemap
    legend.position = c("right", "bottom")
    )

my_pretty_map <- tmap_leaflet(my_map) %>% # convert tmap obj to leaflet obj
  addSearchFeatures( # add search box
    targetGroups = "2016 Presidential",
    options = searchFeaturesOptions(
      zoom = 14,
      openPopup = TRUE,
      collapsed = FALSE,
      position = "topright",
      textPlaceholder = "Search EDs..."
      )
    )

my_pretty_map

Hey, that’s a pretty good map! In a future post I’ll add more contextual data to the map like sociodemographics, turnout rates, and prior year election results. This additional information will help CTC identify the best EDs to target.

For ease of this demo, I’ve posted a geojson of election district polygons in NY 11 on my GitHub. I accessed the original shapefile from the NYC Department of City Planning’s BYTES of the BIG APPLE Archive.↩
In fact, in the map I’m going to make, I will convert my tmap object to a leaflet object and then add some extra leaflet features via the leaflet.extras package.↩
One downside to using tmap to create Leaflet maps is that you lose some ability to customize layers and legends. I wanted to add a legend that described the green and orange borders as well as customize the scale in the margin legend, but wasn’t able to.↩

Matt Herman