nycgeo: An R package to get spatial data and census estimates for NYC

The nycgeo package contains spatial data files for various geographic and administrative boundaries in New York City as well as tools for working with NYC geographic data. Data is in the sf (simple features) format and includes boundaries for boroughs (counties), public use microdata areas (PUMAs), community districts (CDs), neighborhood tabulation areas (NTAs), census tracts, and census blocks. In the future, more boundaries will be added, such as city council districts, school districts, and police precincts.

Additionally, selected demographic, social, and economic estimates from the U.S. Census Bureau American Community Survey can be added to the geographic boundaries in nycgeo, allowing for contextualization and easy choropleth mapping. Finally, nycgeo makes it simple to access a subset of spatial data in a particular geographic area, such as all census tracts in Brooklyn and Queens.

nycgeo is currently hosted on GitHub and can be downloaded here:

Predicting shelter entry using natural language processing of Homebase case notes

In collaboration with the Center for Innovation through Data Intelligence and the NYC Department of Homeless Services, this project used natural language processing of homelessness prevention case notes to predict an individual’s risk of shelter entry. Specifically, the study investigated how unstructured case notes can be used to learn more about individuals using Homebase homelessness prevention services in New York City. Are there words, phrases, or topics that occur more frequently in the unstructured case notes of individuals who enter shelter after using Homebase services than in the notes of those who do not? And can a predictive model that assesses the probability of shelter entry based on structured data be improved by incorporating insights from unstructured case notes?
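The first question above, whether some words appear more often in one group's notes than in the other's, can be illustrated with a minimal sketch. This is not the study's actual method or data. The two lists of notes are invented placeholders, the function names are hypothetical, and the score is a simple smoothed log ratio of relative word frequencies chosen only for illustration.

```python
import math
from collections import Counter


def word_counts(notes):
    """Count lowercase, whitespace-separated tokens across a list of notes."""
    counts = Counter()
    for note in notes:
        counts.update(note.lower().split())
    return counts


def log_frequency_ratio(group_a, group_b, smoothing=1.0):
    """Return {word: log(p_a / p_b)} using add-one style smoothing.

    Positive scores mean the word is relatively more frequent in group_a,
    negative scores mean it is relatively more frequent in group_b.
    """
    a = word_counts(group_a)
    b = word_counts(group_b)
    vocab = set(a) | set(b)
    # Smoothed totals so that words missing from one group do not divide by zero.
    total_a = sum(a.values()) + smoothing * len(vocab)
    total_b = sum(b.values()) + smoothing * len(vocab)
    scores = {}
    for word in vocab:
        p_a = (a[word] + smoothing) / total_a
        p_b = (b[word] + smoothing) / total_b
        scores[word] = math.log(p_a / p_b)
    return scores


# Invented placeholder notes (assumption: NOT real case notes).
entered_shelter = ["eviction notice received", "eviction hearing scheduled"]
did_not_enter = ["rent arrears paid", "lease renewed rent paid"]

scores = log_frequency_ratio(entered_shelter, did_not_enter)

# List words from most to least associated with the first group.
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:10s} {score:+.2f}")
```

In a fuller analysis, scores like these (or topic-model outputs) could be added as features alongside structured data in a predictive model, which is the second question the project asks.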

The subway as fourth place: anomie, flânerie and the “crush of persons”

As part of a research practicum course at Hunter in Fall 2016, we conceived and conducted a mixed-methods research project to assess social behavior and interaction on the New York City subway. We collected more than 4,000 detailed observations of passenger behavior as well as in-depth “subway diaries” from eighteen research participants. Using logistic regression, we modeled the factors that influence how passengers direct their gaze and configure their bodies while riding the subway. The diaries helped us interpret and understand the observations.

The results of the study have been published in the peer-reviewed journal Applied Mobilities. The research was also covered by the Daily News and CBS New York.