Linkage of Traffic Crash and Hospitalization Records for Enhanced Public Health Surveillance: Methods for Probabilistic Matching with Limited Identifiers

Monday, June 20, 2016: 10:30 AM
Kahtnu 1, Dena'ina Convention Center
Sarah Conderino, New York City Department of Health and Mental Hygiene, New York, NY
Lawrence Fung, New York City Department of Health and Mental Hygiene, New York, NY
Slavenka Sedlar, New York City Department of Health and Mental Hygiene, New York, NY
Jennifer Norton, New York City Department of Health and Mental Hygiene, New York, NY
BACKGROUND: Motor vehicle traffic (MVT) crashes kill or seriously injure approximately 4,250 people in New York City (NYC) each year. Traditionally, NYC surveillance practices use hospitalization and crash data separately to monitor trends in MVT-related injuries, but key information linking crash circumstances to health outcomes is lost when these data sources are analyzed in isolation. Our objective was to match crash reports to hospitalization records to create a traffic injury surveillance dataset that can be used to describe crash circumstances and related injury outcomes. Linking the two systems is challenging because neither the system tracking crashes nor the system tracking hospitalizations and emergency department (ED) visits contains key identifying data such as names and dates of birth.

METHODS: The NYC Department of Transportation provided electronic records based on reports of motor vehicle crashes submitted to the New York State Department of Motor Vehicles for all crashes occurring in NYC from 2009-2013. New York Statewide Planning and Research Cooperative System (SPARCS) ED and hospitalization administrative data from NYC hospitals were used to identify unintentional MVT-related injuries using external cause of injury codes (ICD-9-CM E-Codes: E810-E819). Since the two systems do not share unique individual identifiers, probabilistic record linkage was conducted using LinkSolv 9.0 following national Crash Outcome Data Evaluation System (CODES) methodology. Data were matched by the following variables: age and gender of the individual, date and hour of crash/hospitalization, crash/hospital county, crash role, crash type, and injury location. Data were blocked on age, sex, and either crash role or crash/hospital county. Sensitivity/specificity calculations and frequency diagnostics were conducted to validate linkage results.
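The general idea of probabilistic linkage with blocking can be sketched in a few lines of Python. This is only an illustration of Fellegi-Sunter style agreement weights, not the LinkSolv or CODES implementation: the field names, the m- and u-probabilities, the threshold, and the helper functions are all hypothetical, and the sketch uses a single blocking pass on age, sex, and county with exact agreement on each compared field, which is simpler than the procedure described above.

```python
import math
from collections import defaultdict

# Hypothetical m- and u-probabilities per compared field:
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
# Illustrative values only; they are not taken from the study.
M_U = {
    "date": (0.95, 0.01),
    "hour": (0.80, 0.05),
    "role": (0.85, 0.30),
}


def field_weight(field, agrees):
    """Log2 likelihood-ratio weight for agreement or disagreement on one field."""
    m, u = M_U[field]
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def block_key(record):
    """Blocking key: only records sharing age, sex, and county are compared."""
    return (record["age"], record["sex"], record["county"])


def link(crash_records, hospital_records, threshold):
    """Return (crash_id, hospital_id, score) for pairs scoring at or above threshold."""
    blocks = defaultdict(list)
    for h in hospital_records:
        blocks[block_key(h)].append(h)

    pairs = []
    for c in crash_records:
        for h in blocks.get(block_key(c), []):
            # Sum the per-field weights into a total match score.
            score = sum(field_weight(f, c[f] == h[f]) for f in M_U)
            if score >= threshold:
                pairs.append((c["id"], h["id"], score))
    return pairs


# Made-up example records for demonstration.
crashes = [
    {"id": "c1", "age": 30, "sex": "M", "county": "Kings",
     "date": "2012-05-01", "hour": 14, "role": "driver"},
]
visits = [
    {"id": "h1", "age": 30, "sex": "M", "county": "Kings",
     "date": "2012-05-01", "hour": 14, "role": "driver"},
    {"id": "h2", "age": 30, "sex": "M", "county": "Kings",
     "date": "2012-07-19", "hour": 3, "role": "pedestrian"},
]
linked = link(crashes, visits, threshold=5.0)
```

Blocking keeps the number of candidate comparisons manageable, at the cost of missing true matches whose blocking fields disagree; running more than one pass with different blocking keys is one way to reduce that loss.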

RESULTS: From 2009-2013, there were 1,054,344 individuals involved in MVT crashes in NYC and 280,340 ED visits and hospitalizations from MVT-related injuries. The completed match resulted in 145,003 linked pairs, a match rate of 52% of all MVT-related SPARCS records. The match had a sensitivity of 74% and a specificity of 93%. Frequency distributions comparing linked and unlinked records were similar by age, sex, and county, indicating no apparent biases in the match by these variables. Frequency distributions varied by crash role and crash type, with a higher proportion of matches occurring among MV drivers and collisions between multiple MVs.
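The reported match rate follows directly from the counts above, and the validation measures use the standard definitions of sensitivity and specificity. A minimal sketch follows; the function names are hypothetical, and the abstract does not report the underlying true/false positive and negative counts, so only the match rate is computed from reported figures.

```python
def match_rate(linked_pairs, total_records):
    """Share of records that were linked."""
    return linked_pairs / total_records


def sensitivity(true_pos, false_neg):
    """Proportion of true matches that the linkage identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of true non-matches that the linkage left unlinked."""
    return true_neg / (true_neg + false_pos)


# Counts reported in the abstract: linked pairs and MVT-related SPARCS records.
rate = match_rate(145_003, 280_340)
print(f"{rate:.1%}")  # prints 51.7%, reported as 52%
```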

CONCLUSIONS:  Performing a probabilistic match between MVT crash reports and hospitalization records is possible with a limited set of identifying variables. These linked data will inform traffic safety policies by providing new information on how crash circumstances translate to health outcomes.