The Umati Project: Monitoring Dangerous Speech Online
The Umati project emerged out of concern that mobile and digital technologies may have played a catalyzing role in Kenya's 2007/08 post-election violence. It seeks to better understand the use of dangerous speech in the Kenyan online space by monitoring selected blogs, forums, online newspapers, Facebook, and Twitter. Monitored content includes tweets, status updates, comments, posts, and blog entries.
When an example of hateful or inflammatory speech is found on any of these platforms, the research team uses Susan Benesch’s Dangerous Speech Guidelines to classify the example according to its dangerousness, that is, its capacity to inspire or provoke violence. Professor Benesch’s five-part framework for dangerous speech has enabled the Umati project to develop a methodology for collecting and analyzing online hate speech. Professor Benesch has advised the project throughout.
Umati has therefore been working towards the following:
To propose both a workable definition of hate speech and a contextualized methodology for tracking online hate speech that can be replicated locally and in other countries.
To collect and monitor the occurrence of hate speech in the Kenyan online space.
To further education on the possible outcomes of hate speech, so as to promote civil communication and interaction in both online and offline spaces.
Some of the research questions guiding Umati work include:
What do Kenyans understand hate speech to be?
What events and issues influence hate speech online?
Who/what are the key drivers of online hate speech in Kenya?
Has online hate speech catalyzed offline (violent) events?
The Umati Project has relied on a largely manual process for collecting and categorizing online hate speech. Human input proved necessary to review local vernacular languages and local vocabulary accurately, and to build a database of inflammatory speech. More on the methodology used can be found in the Umati Phase 1 Final Report.
Umati Phase II explores using Machine Learning (ML) techniques and Natural Language Processing (NLP) to detect, collect, select, and sort hate speech and dangerous speech from the Kenyan online space. We are looking to automate aspects of the current Umati process in order to improve the scalability of the system.
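To make the idea of automating part of this process more concrete, the following is a minimal sketch of a "select and sort" step that flags posts for human review. It is not Umati's actual system: the lexicon, its weights, the Post type, and the score and triage functions are all hypothetical, and a simple weighted term match stands in for the ML and NLP techniques that Phase II would investigate.

```python
import re
from dataclasses import dataclass

# Hypothetical placeholder lexicon: neutral stand-in tokens with made-up
# weights. This is NOT the Umati project's vocabulary or scoring scheme.
FLAGGED_TERMS = {"term_a": 1.0, "term_b": 2.5}


@dataclass
class Post:
    source: str  # e.g. "twitter", "blog", "forum"
    text: str


def score(post: Post) -> float:
    """Sum the weights of flagged terms that occur in the post text."""
    tokens = re.findall(r"\w+", post.text.lower())
    return sum(FLAGGED_TERMS.get(tok, 0.0) for tok in tokens)


def triage(posts: list[Post], threshold: float = 1.0) -> list[tuple[float, Post]]:
    """Select posts scoring at or above the threshold, highest score first."""
    scored = [(score(p), p) for p in posts]
    selected = [sp for sp in scored if sp[0] >= threshold]
    return sorted(selected, key=lambda sp: sp[0], reverse=True)
```

In a sketch like this, the output of triage would only be a queue for human annotators, since the project found that human judgment was necessary for local languages and vocabulary.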