Overview
Data fusion is the idea that multiple data sources can be combined to produce information that is more complete, accurate, or useful than any single source provides. It is as valuable for historical data as for modern data, although it is rarely applied with a historical focus. In this project, we demonstrate the fusion of two highly detailed historical datasets to produce spatially explicit demographic maps of the community of State College, PA for two time periods (1920 and 1930).
Historical spatial data came from building footprints digitized from Sanborn Fire Insurance Maps, while the demographic information came from manuscript U.S. Census records accessible through Ancestry.com (Library Edition) and from digitized Penn State student directories.
Historical data is often messy, so we prepared it using ABBYY FineReader OCR and regular expressions, as implemented in the grep family of functions in R.
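To illustrate the kind of regular-expression cleanup described above, here is a minimal sketch. It is written in Python rather than R, and it is not the project's actual code: the sample line layout, the OCR confusions it repairs, and the function names clean_ocr_line and extract_address are all assumptions made for illustration.

```python
import re

def clean_ocr_line(line: str) -> str:
    """Tidy one OCR'd directory line (hypothetical layout)."""
    line = line.strip()
    # Collapse runs of whitespace left behind by column layouts.
    line = re.sub(r"\s+", " ", line)
    # Assumed OCR confusions: 'l' or 'I' read for '1', and 'O' read for '0',
    # repaired only when the character sits between two digits.
    line = re.sub(r"(?<=\d)[lI](?=\d)", "1", line)
    line = re.sub(r"(?<=\d)O(?=\d)", "0", line)
    return line

# House number followed by the rest of the line as the street name.
ADDRESS_RE = re.compile(r"(\d+)\s+(.+)$")

def extract_address(line: str):
    """Return (house_number, street) or None if no number is found."""
    match = ADDRESS_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

raw = "  Smith,  John   2l4 W. College  Ave. "
cleaned = clean_ocr_line(raw)
print(cleaned)
print(extract_address(cleaned))
```

In R, the same substitutions could plausibly be written with gsub and the same patterns using perl = TRUE, since the lookbehind and lookahead syntax is Perl-style.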
Results
Following data preparation, 72% of the 3,144 historical demographic records for 1920 were matched to a specific building location, compared with 88% of the 1930 records. The results are spatially explicit datasets of demographic information that can be used to visualize the spatial dimensions of population and socio-cultural change through time.
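The matching step can be pictured as a join between demographic records and building footprints. The sketch below assumes the join key is a normalized street address, which this summary does not confirm. The field names, abbreviation table, and sample rows are hypothetical and are not project data.

```python
import re

# Assumed abbreviation expansions; a real table would be much longer.
ABBREVIATIONS = {
    "ave": "avenue", "st": "street", "rd": "road",
    "n": "north", "s": "south", "e": "east", "w": "west",
}

def normalize_address(address: str) -> str:
    """Lowercase, drop punctuation, and expand common abbreviations."""
    address = re.sub(r"[.,]", "", address.lower())
    words = [ABBREVIATIONS.get(word, word) for word in address.split()]
    return " ".join(words)

def match_records(records, buildings):
    """Attach a building_id (or None) to each record by normalized address."""
    index = {normalize_address(b["address"]): b["building_id"] for b in buildings}
    return [
        {**r, "building_id": index.get(normalize_address(r["address"]))}
        for r in records
    ]

def match_rate(matched) -> float:
    """Share of records that received a building_id."""
    if not matched:
        return 0.0
    hits = sum(1 for r in matched if r["building_id"] is not None)
    return hits / len(matched)

# Hypothetical sample rows for illustration only.
buildings = [
    {"building_id": "B1", "address": "214 West College Avenue"},
    {"building_id": "B2", "address": "120 S. Allen St."},
]
records = [
    {"name": "Smith", "address": "214 W. College Ave."},
    {"name": "Jones", "address": "120 South Allen Street"},
    {"name": "Doe", "address": "9 Unknown Rd"},
]
matched = match_records(records, buildings)
print(f"{match_rate(matched):.0%} of records matched")
```

A match rate computed this way corresponds to the percentages reported above only if every record is counted once in the denominator, which is another assumption of the sketch.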
Interactive ArcGIS Online maps of U.S. Census records searchable by last name
Interactive ArcGIS Online maps of Penn State Student directories searchable by last name
Interactive ArcGIS Online maps of U.S. Census and student directories searchable by last name
People
Alia Horvath, Kiersten Hudson, Connor Henderson, Jack Swab, Tara LaLonde, Heather Ross, Nathan Piekielek, Albert Rozo
Presentations
Henderson, Swab et al. (2017). PAGIS Conference.
Acknowledgements
Support for this project has been provided by the Penn State University Libraries Bednar Internship Program.