
AI Case Study

Researchers from various universities and labs enhance flood data collection with natural language processing and machine vision

Researchers at UC Berkeley, University of Dundee, Oak Ridge National Laboratory, Tufts University and Blue Urchin LLC have employed social media and crowdsourcing data to address the lack of hyper-resolution datasets for urban flooding. Data gathered from Twitter and a crowdsourcing app, MyCoast, are analysed using natural language processing and computer vision. The complementary data gathering method can improve detailed flooding risk analysis, urban flooding control, and the validation of hyper-resolution numerical models.


Public And Social Sector

Public Services

Project Overview

Researchers have presented a "new approach to collect and process data for urban flooding research using Natural Language Processing and Computer Vision (CV) techniques. These techniques are shown promising to extract hyper-resolution data with a wide coverage to support urban flooding issues."

Reported Results

"The present study shows that social media and crowdsourcing can be used to complement the datasets developed based on traditional remote sensing and witness reports. Applying these methods in two case studies, we found these methods are generally informative in flood monitoring. Twitter data is found weakly correlated to precipitation departure. We determined a length scale of tweet volume pattern, at which the data points are most clustered. The computer vision processed crowdsourcing data is compared against the road closure data. The results show that computer vision still has a room to improve, especially in coastal areas. These two methods are compared and a series of recommendation is given to improve the big data based flood monitoring in the future."
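As a rough illustration of the correlation finding quoted above, the sketch below computes a Pearson correlation coefficient between daily tweet counts and precipitation departure. The choice of Pearson correlation, the function name `pearson`, and every number in the sample series are assumptions made for illustration only; they are not the study's method or data.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, invented for this example (not data from the study).
daily_tweets = [12, 15, 9, 30, 22, 11, 14]
precip_departure_mm = [0.5, -1.0, 2.0, 6.5, 1.0, 3.0, -0.5]

r = pearson(daily_tweets, precip_departure_mm)
print(f"Pearson r = {r:.2f}")
```

A value of r near zero would indicate little linear relationship, and values near 1 or -1 a strong one; where the boundary for "weakly correlated" lies is a judgment call that this sketch does not settle.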


Named Entity Recognition (NER)
Machine vision


Digital Data

Digital Data Management


"Urban flooding is a global problem that costs lives and money. In 2010 alone, 178 million people suffered from floods. The total economic losses in 1998 and 2010 both exceeded $40 billion (Jha et al., 2012). Urban floods can be caused by a variety of reasons, including natural hazards of river overflow, coastal storm surge, sea-level rise, flash floods, groundwater seepage, sewer overflow, lack of permeability, and lack of city management.

As urbanization proceeds and climate change intensifies, urban planners and city managers are facing the challenge of preparing for and mitigating flood damage. They need tools to monitor and predict the event for emergency response and development planning.

Monitoring and predicting urban floods needs high-resolution data with good coverage. High-resolution data can capture the variation of flood flows among streets or parcels, so that the heterogeneity of flood flows caused by heterogenous urban landscape can be captured. In this study, we define data that can reflect the variation on the parcel and street scale as “hyper-resolution” data. In addition to resolution, it is important to have a good coverage of flood data to obtain complete information."



Twitter feed, crowd-sourcing data and authorised data

There are two ways to contribute to the MyCoast database: via the web and via a mobile app. The user interface of the app is shown in Fig. 2. The system currently contains over 5000 flood photographs, most of which were collected through the mobile app.
The app uses the phone's sensors to establish location and date/time information. Users are prompted to take a photograph and may then optionally add written comments (Fig. 2). Users are also shown a chart with tide timing so that they can try to time their photographs to coincide with peak tides.
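The kind of record described above (a photograph with sensor-derived location and time, plus an optional comment) might be modelled as in the sketch below, which also measures how far a photo's timestamp falls from the nearest peak tide. The class `FloodPhotoReport`, its field names, both helper functions, and the sample values are hypothetical and do not reflect the actual MyCoast schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass(frozen=True)
class FloodPhotoReport:
    """Hypothetical record for one crowdsourced flood photograph."""
    latitude: float                # from the phone's location sensor
    longitude: float
    taken_at: datetime             # from the phone's clock
    photo_path: str
    comment: Optional[str] = None  # written comments are optional


def nearest_peak_tide(taken_at: datetime, peak_times: Sequence[datetime]) -> datetime:
    """Return the peak-tide time closest to the moment the photo was taken."""
    if not peak_times:
        raise ValueError("at least one peak-tide time is required")
    return min(peak_times, key=lambda t: abs(t - taken_at))


def offset_from_peak(report: FloodPhotoReport, peak_times: Sequence[datetime]) -> timedelta:
    """Return photo time minus nearest peak tide (negative means before the peak)."""
    return report.taken_at - nearest_peak_tide(report.taken_at, peak_times)


# Invented sample values for illustration only.
report = FloodPhotoReport(
    latitude=42.36,
    longitude=-71.05,
    taken_at=datetime(2020, 1, 1, 11, 40),
    photo_path="photos/example.jpg",
)
peaks = [datetime(2020, 1, 1, 0, 10), datetime(2020, 1, 1, 12, 25)]
offset = offset_from_peak(report, peaks)
print(f"Offset from nearest peak tide: {offset.total_seconds() / 60:.0f} minutes")
```

Keeping the comment optional mirrors the description that written comments are not required, and storing the offset from peak tide would let later analysis separate photos taken near peak water level from those taken well before or after it.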
