In large-scale emergencies, people post a lot of information on social media about their status, their needs, and their ability to help. In principle, these posts might help emergency management teams get a better picture of the situation and find useful resources, but their sheer number and questionable accuracy make them less useful than they could be. This project is about developing tools that identify people's intentions related to the emergency, sorting tweets into categories such as requests for help or information, offers of help, announcements of safety or location, and so on. This problem of intent inference is a key scientific problem in natural language processing and artificial intelligence, with practical uses in a number of areas beyond emergency management, including web search and location-aware services. The researchers will attack the intent inference problem by narrowing it to the emergency response domain. First, they will work closely with emergency response teams to identify meaningful categories of intent that align with emergency response needs, in order to guide the collection and labeling of social media posts. Then, they will develop strategies drawn from existing image and natural language processing techniques and informed by the emergency response context to do the categorization work. Finally, they will build and evaluate a tool that uses the categorization algorithms to highlight the social media posts that are most likely to be useful to emergency responders. The work will be used to help develop courses around data science at the lead researcher's school, and the tools will be made publicly available as open-source code and advertised to communities of interest.
To build the set of crisis-specific intent categories, the research team will first analyze existing operational manuals for emergency response, including the Incident Command System models, to extract key processes and initial categories. The team will then refine that set with experts from the Fairfax Fire and Rescue Department, an advisory committee of the social media working group for emergency services at the Department of Homeland Security that has members across the country, and members of the project's advisory board. Intent extraction will be modeled as a multilabel classification problem on two dimensions: type of intent and topical category. This formulation maps well to the characteristics of posts (which might contain multiple intents and topics) and scopes the complexity of general intent inference. Datasets will be gathered from prior crisis events and labeled, according to the categories identified in the first phase, by crowd workers interested in humanitarian work. Features of posts will be constructed from the semantic metadata of posts using natural language processing techniques on textual content, image processing techniques on multimedia content, and author profiling techniques. Post features will include syntactic-semantic patterns that represent declarative and psycholinguistic knowledge, as well as ideas from discourse analysis, while author features will be drawn from the profile information authors provide as well as aggregate inferences from their posts. The team will use a multi-task learning framework as the underlying algorithm to leverage relationships between the different categories to be classified. Finally, the developed interface will support faceted browsing by intent, topic, location, and response management process, and will be evaluated through training exercises with the research team's practitioner partners.
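As a rough illustration of the two-dimensional multilabel formulation described above, the following Python sketch encodes a labeled post as one multi-hot target vector per dimension (intent type and topic) and filters posts by facet. It is a minimal sketch under stated assumptions, not the project's implementation: the label vocabularies, the `LabeledPost` class, the helper functions, and the example posts are all hypothetical, since the actual categories are to be defined with practitioner partners.

```python
from dataclasses import dataclass, field

# Hypothetical label vocabularies (assumptions for illustration only);
# the project derives the real categories with emergency responders.
INTENT_TYPES = ["request_help", "request_info", "offer_help", "report_status"]
TOPICS = ["medical", "shelter", "food", "transport"]


@dataclass
class LabeledPost:
    """A post annotated on both dimensions; each may hold several labels."""
    text: str
    intents: set = field(default_factory=set)
    topics: set = field(default_factory=set)


def multi_hot(labels, vocabulary):
    """Encode a set of labels as a 0/1 vector aligned with `vocabulary`."""
    unknown = set(labels) - set(vocabulary)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in vocabulary]


def encode(post):
    """Return one target vector per dimension: (intent types, topics)."""
    return multi_hot(post.intents, INTENT_TYPES), multi_hot(post.topics, TOPICS)


def facet_filter(posts, intent=None, topic=None):
    """Keep posts that carry the requested intent and/or topic label."""
    return [
        p for p in posts
        if (intent is None or intent in p.intents)
        and (topic is None or topic in p.topics)
    ]


# Made-up example posts, each carrying multiple intents or topics.
posts = [
    LabeledPost(
        "Need insulin and a ride to the shelter",
        {"request_help"},
        {"medical", "transport", "shelter"},
    ),
    LabeledPost(
        "We are safe. Can host two families tonight",
        {"report_status", "offer_help"},
        {"shelter"},
    ),
]

intent_vec, topic_vec = encode(posts[1])
print(intent_vec)  # [0, 0, 1, 1]
print(topic_vec)   # [0, 1, 0, 0]
print([p.text for p in facet_filter(posts, intent="offer_help", topic="shelter")])
```

Keeping a separate target vector per dimension mirrors the idea that the two label sets are related but distinct tasks. A multi-task learner could, for instance, share a text representation while predicting each vector with its own output layer, though the source does not specify that architecture.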
- Prof. Hemant Purohit
- Rahul Pandey (PhD research assistant, GMU)
- Yogen Chaudhari (MS research assistant, GMU)
- Bahman Pedrood (PhD research assistant, GMU)