Validating Twitter As a Data Source for Foodborne Illness Outbreak Detection in New York City

Tuesday, June 6, 2017: 4:40 PM
410B, Boise Centre
Katelynn Devinney, New York City Department of Health and Mental Hygiene, Long Island City, NY
Adile Bekbay, New York City Department of Health and Mental Hygiene, Queens, NY
Thomas Effland, Columbia University, New York, NY
Luis Gravano, Columbia University, New York, NY
David Howell, New York City Department of Health and Mental Hygiene, Queens, NY
Daniel Hsu, Columbia University, New York, NY
Daniel O'Hallorhan, New York City Department of Health and Mental Hygiene, New York, NY
Chandrajeet Padhy, New York City Department of Health and Mental Hygiene, Queens, NY
Vasudha Reddy, New York City Department of Health and Mental Hygiene, Queens, NY
Faina Stavinsky, New York City Department of Health and Mental Hygiene, New York, NY
HaeNa Waechter, New York City Department of Health and Mental Hygiene, Queens, NY
Sharon Balter, New York City Department of Health and Mental Hygiene, New York, NY

BACKGROUND:  Each year, the New York City (NYC) Department of Health and Mental Hygiene (DOHMH) manages more than 4,000 foodborne illness (FI) reports, which are received via the citywide complaint system (311) or identified on Yelp, and detects about 30 outbreaks. NYC has approximately 24,000 restaurants, 15,000 food retailers, and >8.5 million residents, and many FI incidents likely go unreported. DOHMH sought to incorporate and validate an additional social media data source, Twitter, to enhance FI complaint and outbreak detection.

METHODS:  DOHMH collaborated with Columbia University to develop a text mining algorithm that identifies tweets indicating FI. Twitter data are received via a targeted API query that searches for FI keywords and uses metadata to select tweets with a possible NYC location. Each tweet is assigned a sick score between 0 and 1; tweets meeting a threshold value of 0.5 are manually reviewed by an epidemiologist. A survey link is then tweeted to users who have tweeted about FI, requesting more information regarding the date and time of the FI event, the restaurant, and the user's contact information. Survey data are used to validate complaints and are incorporated into a daily analysis that uses all sources of complaint data to identify restaurants with multiple FI complaints within a 30-day period. This system was launched on November 28, 2016.
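
The two rule-based steps described above (the review threshold and the 30-day complaint window) can be illustrated with a minimal sketch. This is not the DOHMH or Columbia implementation: the `Tweet` and `Complaint` records, the function names, the minimum of two complaints, and the treatment of "within a 30-day period" as dates fewer than 30 days apart are all assumptions made for illustration. The sick score itself comes from the text mining algorithm, which is not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

SICK_SCORE_THRESHOLD = 0.5  # threshold value stated in the abstract
WINDOW_DAYS = 30            # window length stated in the abstract


@dataclass
class Tweet:
    """Hypothetical record: tweet text plus its precomputed sick score."""
    text: str
    sick_score: float


@dataclass
class Complaint:
    """Hypothetical record for one FI complaint from any source."""
    restaurant: str
    illness_date: date
    source: str  # e.g. "311", "Yelp", or "Twitter"


def tweets_for_review(tweets):
    """Keep tweets whose sick score meets the threshold (assumed >=)."""
    return [t for t in tweets if t.sick_score >= SICK_SCORE_THRESHOLD]


def restaurants_with_multiple_complaints(complaints, window_days=WINDOW_DAYS,
                                         min_complaints=2):
    """Return restaurants with at least `min_complaints` complaints whose
    dates fall fewer than `window_days` days apart (an assumed reading of
    "within a 30-day period")."""
    dates_by_restaurant = defaultdict(list)
    for c in complaints:
        dates_by_restaurant[c.restaurant].append(c.illness_date)

    flagged = set()
    for name, dates in dates_by_restaurant.items():
        dates.sort()
        # Compare each date with the one (min_complaints - 1) positions later.
        for i in range(len(dates) - min_complaints + 1):
            span = (dates[i + min_complaints - 1] - dates[i]).days
            if span < window_days:
                flagged.add(name)
                break
    return sorted(flagged)
```

For example, two complaints naming the same restaurant on December 1 and December 20 would flag it, while complaints on November 1 and December 15 would not.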

RESULTS:  During November 28, 2016–December 29, 2016, 1,480 tweets qualified for review (49/day on average); 556 (37.6%) indicated FI in NYC, and 435 (29.4%) were tweeted a survey link (the remaining 121 FI tweets either had been deleted by the Twitter user or came from a user who had already been sent a survey for the same FI incident). The survey tweets resulted in 13 likes, three retweets, 46 replies, and 88 survey link clicks. Eight Twitter users submitted surveys (response rate 1.8%). All eight confirmed FI, and none had been reported via 311 or Yelp; five completed phone interviews. No outbreaks were identified.
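
As a reading aid, the percentages above can be reproduced from the reported counts. The snippet below is only an arithmetic check; the variable names are illustrative, and the denominators (all reviewed tweets for the first two percentages, survey tweets sent for the response rate) are inferred from the figures rather than stated explicitly.

```python
reviewed = 1480      # tweets that qualified for review
indicated_fi = 556   # tweets indicating FI in NYC
survey_sent = 435    # tweets that were sent a survey link
surveys_done = 8     # surveys submitted


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


fi_share = pct(indicated_fi, reviewed)        # 37.6
survey_share = pct(survey_sent, reviewed)     # 29.4
response_rate = pct(surveys_done, survey_sent)  # 1.8
not_surveyed = indicated_fi - survey_sent     # 121

print(fi_share, survey_share, response_rate, not_surveyed)
```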

CONCLUSIONS:  The use of Twitter for FI outbreak detection is still being validated; however, the initial identification of new complaints not otherwise reported to 311 or Yelp suggests that it will be a useful tool. Future plans include using system feedback data to increase the sensitivity and specificity of the text mining algorithm. In addition, we intend to share this system with other health departments so that they can incorporate Twitter into their outbreak detection and public health surveillance activities.