AI Case Study

University of Rochester researchers have developed an app that uses natural language processing and machine learning to monitor social media and identify restaurants that pose health risks

The University of Rochester developed an app that analyzes geotagged social media posts to predict outbreaks of foodborne illness.

Industry

Public And Social Sector

Education And Academia

Project Overview

"Computational approaches to health monitoring and epidemiology
continue to evolve rapidly. We present an end-to-end system, nEmesis, that automatically identifies restaurants posing public health risks. Leveraging a language model of Twitter users’ online communication, nEmesis finds individuals who are likely suffering from a foodborne illness. People’s visits to restaurants are modeled by matching GPS data embedded in the messages with restaurant addresses. As a result, we can assign each venue a “health score” based on the proportion of customers that fell ill shortly after visiting it. Statistical analysis reveals that our inferred health score correlates (r = 0.30) with the official inspection data from the Department of Health and Mental Hygiene (DOHMH). We investigate the joint associations of multiple factors mined from online data with the DOHMH violation scores and find that over 23% of variance can be explained by our factors. We demonstrate that readily accessible online data can be used to detect cases of foodborne illness in a timely manner. This approach offers an inexpensive way to enhance current methods to monitor food safety (e.g., adaptive inspections) and identify potentially problematic venues in near-real time."
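
The two statistics in the overview measure different things: r = 0.30 is the correlation of the health score alone with inspection data, while the 23% figure refers to several factors taken together. As a rough illustration, a single predictor with correlation r explains r squared of the variance, which for r = 0.30 is 9%. The sketch below computes a Pearson correlation in plain Python; the function names are illustrative and are not taken from the nEmesis code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def variance_explained(r):
    """Share of variance explained by a single linear predictor (r squared)."""
    return r * r
```

Here `variance_explained(0.30)` gives about 0.09, which is why the multi-factor result of over 23% is a separate, stronger claim than the single correlation.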

Reported Results

"Citations for health violations in 15 percent of inspections using nEmesis, compared to 9 percent using the random system.
Researchers estimate that these improvements to the efficacy of the inspections led to 9,000 fewer food poisoning incidents and 557 fewer hospitalizations in Las Vegas during the 3 month period of study."

Technology

"Leveraging a language model of Twitter users’ online communication, nEmesis finds individuals who are likely suffering
from a foodborne illness. People’s visits to restaurants are modeled by matching GPS data embedded in the messages with restaurant addresses. As a result, we can assign each venue a “health score” based on the proportion of customers that fell ill shortly after visiting it."
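
A minimal sketch of the health score described above, under stated assumptions: each tweet already carries a `sick` flag (standing in for the language model's output), a tweet counts as a visit when it falls within `radius_m` meters of a restaurant, and a visitor counts as ill if they post a sick tweet within `window_h` hours of the visit. The class names, the 25 m radius, and the 72-hour window are illustrative choices, not values confirmed by the text.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Restaurant:
    name: str
    lat: float
    lon: float


@dataclass(frozen=True)
class GeoTweet:
    user: str
    lat: float
    lon: float
    hour: float  # hours since an arbitrary start time
    sick: bool   # assumed output of the sickness classifier


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in meters."""
    earth_radius_m = 6371000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def nearest_restaurant(tweet, restaurants, radius_m=25.0):
    """Return the closest restaurant within radius_m of the tweet, or None."""
    best, best_dist = None, radius_m
    for r in restaurants:
        d = distance_m(tweet.lat, tweet.lon, r.lat, r.lon)
        if d <= best_dist:
            best, best_dist = r, d
    return best


def health_scores(tweets, restaurants, radius_m=25.0, window_h=72.0):
    """Map restaurant name -> proportion of visitors who reported illness
    within window_h hours of a visit."""
    sick_times = {}
    for t in tweets:
        if t.sick:
            sick_times.setdefault(t.user, []).append(t.hour)

    visitors, sick_visitors = {}, {}
    for t in tweets:
        venue = nearest_restaurant(t, restaurants, radius_m)
        if venue is None:
            continue
        visitors.setdefault(venue.name, set()).add(t.user)
        if any(0 <= s - t.hour <= window_h for s in sick_times.get(t.user, [])):
            sick_visitors.setdefault(venue.name, set()).add(t.user)

    return {name: len(sick_visitors.get(name, set())) / len(users)
            for name, users in visitors.items()}
```

Counting distinct users rather than tweets keeps one talkative customer from dominating a venue's score; whether nEmesis does the same is not stated here.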

Function

R And D

Core Research And Development

Background

nEmesis analyzes tweets, raises alerts about foodborne illness outbreaks, and identifies their sources.

Benefits

Data

"Restaurants in DOHMH inspection database 24,904
Restaurants with at least one Twitter visit 17,012
Restaurants with at least one sick Twitter visit 120
Number of tweets 3,843,486
Number of detected sick tweets 1,509
Sick tweets associated with a restaurant 479
Number of unique users 94,937
Users who visited at least one restaurant 23,459

The model is always evaluated on a static independent held-out set of 1,000 tweets. The model M achieves 63% precision and 93% recall after the final learning iteration. Only 9,743 tweets were adaptively labeled by human workers to achieve this performance: 6,000 for the initial model, 1,176 found independently by human computation, and 2,567 labeled by workers as per M’s request."
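
For readers less familiar with the evaluation terms, the sketch below shows how precision and recall are computed from predicted and true labels on a held-out set, and checks that the three labeling batches quoted above sum to 9,743. The function names and the `LABELED_TWEETS` dictionary are illustrative. The F1 value is derived here from the quoted precision and recall and is not reported in the text.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Labeling batches quoted in the Data section.
LABELED_TWEETS = {
    "initial model": 6000,
    "found independently by human computation": 1176,
    "labeled at the model's request": 2567,
}

assert sum(LABELED_TWEETS.values()) == 9743
```

With 63% precision and 93% recall, `f1_score(0.63, 0.93)` comes to roughly 0.75. The model favors catching most sick tweets at the cost of some false alarms.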