Projects and prizes
Hack Hate 2020
An ecosystem of tools
You can't predict what will come out of a hackathon - but there's always a wide range of projects and it's in this diversity of approach that we find the most exciting innovations. An extraordinary and emergent theme from this year's hack was just how well the projects could integrate with each other.
As you can see in the projects graph, an ecosystem of potentially interoperable tools formed - each with valuable outputs that others might use as inputs.
Some projects focussed on seeking out and collating information about hate speech in popular social networks, others found ways for people or machine learning tools to categorise hatred quickly, while others provided presentation layers for public safety practitioners and agencies to make use of those outputs...
Open sourced, license TBC
Hate Detector
Joe, Cathy, Raj, Jasper, Markela
Winner: Best use of AI / ML (AWS)
Winner: Best use of location (ESRI)
Problem
To understand the networks and patterns of hate and how such content is spread.
The dissemination of hate speech online and across social media continues to be a significant issue for which there is no quick fix. Despite attempts by platforms such as Twitter and Facebook, and increasing pressure from governments to solve the problem, there is not yet a consistent way to stop the spread of hate. Because preventing posts from being created would be excessively authoritarian, and platforms take time to deal with reports, it is useful to understand the networks and patterns of hate and how such content is spread.
Solution
Using state-of-the-art machine learning models, our project scrapes social media platforms and classifies comments as either hate speech or not. The scraping is designed to focus on a specific real-world event - the US election, for example - and once classified, the comments and the links between them can be examined and visualised as a graph. This makes it possible to build a case against prolific contributors to hate speech and/or understand how it spreads across platforms and communities online.
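The graph step described above could be sketched roughly as follows. This is a minimal illustration, not the team's code: the record layout (author, replied-to author, classifier verdict), the function name `build_hate_graph` and the sample data are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_hate_graph(comments):
    """Build a weighted reply graph from comments classified as hate.

    `comments` is an iterable of (author, replied_to, is_hate) tuples.
    This layout is hypothetical; a real scraper would define its own.
    Returns (edges, hate_counts), where edges maps (author, replied_to)
    to the number of hateful replies and hate_counts maps each author
    to their number of hateful comments.
    """
    edges = defaultdict(int)
    hate_counts = Counter()
    for author, replied_to, is_hate in comments:
        if not is_hate:
            continue  # only comments flagged by the classifier enter the graph
        hate_counts[author] += 1
        if replied_to is not None:
            edges[(author, replied_to)] += 1
    return dict(edges), hate_counts


# Made-up sample data for illustration only.
sample_comments = [
    ("user_a", "user_b", True),
    ("user_a", "user_c", True),
    ("user_b", "user_a", False),
    ("user_c", None, True),
    ("user_a", "user_b", True),
]

edges, hate_counts = build_hate_graph(sample_comments)
most_prolific = hate_counts.most_common(1)
```

The edge weights give the links to visualise, and `hate_counts.most_common()` ranks the most prolific contributors.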
Open sourced under the GNU General Public License (GPL)
Sweep Hate
Dmitry, Devesh
Problem
Sweep Hate addresses the problem of viewing undesirable hateful content online, seeking support when exposed to the content, and reporting such content.
Solution
Sweep Hate is a browser plugin that automatically censors undesirable content online, provides a way to reach supporting organisations in only a couple of clicks, and contains an evidence-gathering tool for quickly reporting the content to multiple parties.
Open sourced, license TBC
POLAR - Portraying OnLine hARm
Piera, Fawzia, Giulio, Johnny, James, Nathaniel
Winner: Prediction of Hate Crime
Problem
Currently, many agencies in the UK do not have a clear understanding of the sentiment surrounding online opinion on key themes such as LGBTQ+, the environment and politics. A clear breakdown of positivity and negativity is not available and can be obtained only through expensive analyses, which some agencies cannot afford.
Solution
Our aim was to create a dashboard that portrays the sentiment (positive/negative/neutral) around four key themes in the UK: immigration, politics, LGBTQ+ and climate action. We scraped tweets, ran text analysis followed by sentiment analysis, and compared data across 24 months: 12 from 2015/2016 and 12 from 2019/2020. We wanted to show how polarisation around these themes changed over time, and whether spikes of negative or positive sentiment could be correlated with key UK events within and between those periods. We also produced wordclouds to highlight how the general debate over these themes changed from a semantic point of view.
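The month-by-month comparison described above could be computed along these lines. This is a sketch under stated assumptions: the input layout (month, theme, sentiment label), the function name `monthly_sentiment_share` and the sample rows are invented for illustration, and the sentiment labels are assumed to come from an earlier analysis step.

```python
from collections import Counter, defaultdict

LABELS = ("positive", "negative", "neutral")


def monthly_sentiment_share(rows):
    """Return the share of each sentiment label per (month, theme).

    `rows` is an iterable of (month, theme, label) tuples, with month
    as a 'YYYY-MM' string. This layout is hypothetical.
    """
    counts = defaultdict(Counter)
    for month, theme, label in rows:
        counts[(month, theme)][label] += 1

    shares = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        # Labels that never occur in a month get a share of 0.0.
        shares[key] = {label: counter[label] / total for label in LABELS}
    return shares


# Made-up rows for illustration only.
sample_rows = [
    ("2016-06", "immigration", "negative"),
    ("2016-06", "immigration", "negative"),
    ("2016-06", "immigration", "positive"),
    ("2016-06", "immigration", "neutral"),
    ("2019-12", "politics", "positive"),
]

shares = monthly_sentiment_share(sample_rows)
```

Plotting the negative share per theme across months is one way to look for spikes that line up with known events.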
Open sourced under the GNU General Public License (GPL)
Rate the Hate
Katie, Ollie
Problem
Hate speech is difficult to classify. One of the reasons for this is that different people have different perceptions of what constitutes hate speech. Another is that there isn't enough labelled data.
Solution
This project has built a platform for classifying hate speech data. This consists of a web app, which will present users with a piece of text. The user will then be asked whether it is hate speech or not. If they click yes, they are asked what subcategory it belongs to (racism, homophobia, etc). The app also records users' demographic information, so that once enough data has been labelled, it will be possible to look at differences in perception across demographic groups.
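Once enough labels exist, the comparison across demographic groups mentioned above might look something like the sketch below. The record fields (`text_id`, `group`, `is_hate`), the function name and the sample labels are assumptions for the example, not the app's actual schema.

```python
from collections import defaultdict


def hate_rate_by_group(labels):
    """Return, per text, the fraction of each group that chose 'hate speech'.

    `labels` is an iterable of dicts with hypothetical keys:
    'text_id', 'group' (a demographic bucket) and 'is_hate' (bool).
    """
    # tallies[text_id][group] holds [yes_votes, total_votes]
    tallies = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for label in labels:
        tally = tallies[label["text_id"]][label["group"]]
        if label["is_hate"]:
            tally[0] += 1
        tally[1] += 1

    return {
        text_id: {group: yes / total for group, (yes, total) in groups.items()}
        for text_id, groups in tallies.items()
    }


# Made-up labels for illustration only.
sample_labels = [
    {"text_id": "t1", "group": "18-24", "is_hate": True},
    {"text_id": "t1", "group": "18-24", "is_hate": False},
    {"text_id": "t1", "group": "65+", "is_hate": True},
]

rates = hate_rate_by_group(sample_labels)
```

Large gaps between groups for the same text would point to the differences in perception that the project wants to study.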
BAE Systems blog post: hacking out hate crime
HateX - Search and Analysis tool
Team BAE Systems Applied Intelligence
Chris, Tigs, Danni, Nickita, Max, Naomi, Monique, Dan, Toby, Thomas
Winner: Detection of Hate Crime
Problem
Organisations that receive reports of hate crimes and incidents (mainly law enforcement agencies, but also charities and NGOs) receive them in a wide variety of disparate formats (e.g. reports taken by police responding to incidents, 101 calls, and online or telephone referrals from third parties). These formats are not easy to search or cross-reference with other information held on separate databases or systems. With efforts being made (including by other teams) to make it easier to report online hate crimes, the volume of reports received is likely to increase dramatically. Identifying other relevant information (e.g. the users of particular email addresses or social media handles) would help to detect the perpetrators of hate crimes more quickly and easily, and to bring them to justice, e.g. through arrest and prosecution.
Solution
Our proposed solution - which could be used by any agency that receives hate crime reports - is a simple web-based tool that searches hate crime incident report data ingested into Elasticsearch, using a NiFi pipeline to extract and standardise entities. Our demonstration model incorporated both standard web/MS Office-based hate referral forms and raw data generated from online platforms such as Twitter and Instagram (we use the Twitter data fields highlighted by Project 11 as the kind of information that could be flagged to this tool). Once in Elasticsearch, the data can be viewed, searched and visualised using Kibana; however, to make it easier for a non-technical audience, we have designed a user-friendly React front end which allows users to conduct a free-text search over all the data the tool has access to.
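A free-text search of this kind could be expressed as an Elasticsearch `multi_match` query. The sketch below only builds the JSON request body; the field names (`description`, `email`, `social_handle`) and the helper `build_search_body` are assumptions for illustration, since the team's index mapping is not described here.

```python
import json


def build_search_body(text, fields, size=10):
    """Build a request body for the Elasticsearch _search endpoint.

    Uses a `multi_match` query so one free-text string is matched
    against several fields. The field names passed in are hypothetical.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
            }
        },
    }


# Hypothetical field names; a real index would use its own mapping.
body = build_search_body(
    "example_handle",
    fields=["description", "email", "social_handle"],
)
payload = json.dumps(body)
```

The resulting `payload` is what a front end would send to the index's `_search` endpoint, with the matching reports returned in the response's `hits`.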
Open sourced, license TBC
Works with the demo release of Parallel Dots
Hate Networks
Team Naimuri
Phil, Kieran, Jonathan, Louis
Problem
Map interactions of purveyors of hate speech across social media. Allow the system to be queried to help investigations.
Solution
As a proof of concept we used a Python script and Selenium to scrape Twitter. We performed sentiment analysis using ParallelDots and stored our findings in a relational database hosted on AWS. Then we set up an API Gateway and a Lambda to query the database with any number of Twitter handles and return JSON that represents the behaviour of those users and their interactions. A React UI was then built to make it easy to access.
The delivered product was a website which displayed analysis of the overlap between Twitter users in terms of retweets, likes and comments. It showed scores for sentiment (both user averages and individual posts) and allowed the user to discover the most problematic Twitter users in a network.
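The Lambda described above might be shaped roughly like this. It is a sketch, not the team's implementation: an in-memory dictionary stands in for the relational database, the `handles` query parameter and the response layout are assumptions, and only the handler signature and the `statusCode`/`body` response follow the usual API Gateway proxy convention.

```python
import json
from itertools import combinations

# Made-up stand-in for the relational database:
# handle -> ids of posts the user retweeted, liked or commented on.
INTERACTIONS = {
    "alice": {"t1", "t2", "t3"},
    "bob": {"t2", "t3", "t4"},
    "carol": {"t9"},
}


def interaction_overlap(handles):
    """Return the posts each pair of handles has both interacted with."""
    overlap = []
    for first, second in combinations(sorted(handles), 2):
        shared = INTERACTIONS.get(first, set()) & INTERACTIONS.get(second, set())
        overlap.append({"users": [first, second], "shared_posts": sorted(shared)})
    return overlap


def handler(event, context):
    """Lambda entry point; expects a hypothetical ?handles=a,b,c parameter."""
    params = event.get("queryStringParameters") or {}
    handles = [h.strip() for h in params.get("handles", "").split(",") if h.strip()]
    return {
        "statusCode": 200,
        "body": json.dumps({"overlap": interaction_overlap(handles)}),
    }


# Local usage example with a hand-built event.
response = handler({"queryStringParameters": {"handles": "alice,bob"}}, None)
```

Keeping the overlap logic in its own function makes it testable without deploying anything to AWS.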
Open sourced, license TBC
Victim Report and Support
Team Help 2 Report
Andrew, Vincent, Sydney, Erica, Billy
Winner: Reporting of Hate Crime
Problem
Hate crime and hate incidents are a serious issue facing people of East and Southeast Asian descent, and the situation has worsened, in particular since the start of the coronavirus pandemic. There has been a 300% increase in hate crimes towards people of East/Southeast Asian heritage in the UK, and a 900% increase in online hate speech towards China. In addition, many other hate crimes and incidents go unreported by the Asian community as a result of lack of awareness, language/cultural barriers, and limited digital literacy. The Asian community is dispersed, with limited funding to support existing community centres. There is a large gap in the reporting and awareness of hate crime that needs to be urgently addressed.
Solution
For third-party Asian community organisations supporting victims of hate crime, who struggle with limited knowledge of hate crime processes and how to support their members, we will provide a comprehensive electronic guidebook that improves their understanding of the reporting and post-reporting process, allowing them to better support their community members and increase the visibility of hate crime.
Open sourced under the MIT license
Data Collator
Joshua, Dylan, Stephen, Luke, Jess
Winner: Best data mashup (Clue)
Problem
We aimed to solve a problem presented by Inclusion London: it is very difficult to see the full picture of the impact of hate crimes against disabled people, because there are many different Deaf and Disabled People's Organisations (DDPOs), each with its own process for collecting data.
Solution
The Data Collator aims to allow all DDPOs to submit data to the same place via a highly usable and accessible form. The data is then collated and will be presented through a dashboard (WIP) to give a more accurate representation of the impact of all of these hate crimes.
Open sourced, license TBC
Hate Hawk
Chris, James, Lewis
Problem
Some hate speech is lost when the original poster deletes it; some is lost when social networks delete it. Hate Hawk helps with the problem of discovery and retention of useful data for researchers and investigators.
Solution
Hate Hawk captures a view of the event. It's the MVP of capture - by allowing anyone to draw the bot's attention, anyone can highlight hate speech and ensure it's recorded for research, triage, or action - depending on how projects further downstream wish to use the data. It's white-labeled, so any number of accounts can be set up on Twitter to do this, allowing localisation and customisation for communities and agencies.
Open sourced, license TBC
Hate Speech Identifier
Oj, Bettina, Sam, Si Ning
Honourable mention: Best use of AI / ML (AWS)
Problem
Our goal is to understand the seasonal trends of Covid-related hate speech faced by academics. Understanding these trends is crucial for developing effective, timely strategies of providing relevant support and tackling the growing culture of distrust against experts.
Solution
We scraped six years of Twitter data from a test sample of 42 UCL Covid-19 experts. Using Python, we classified each tweet as hateful, offensive, or neither, and visualised this data in a series of Tableau dashboards. From these dashboards, we were able to identify some fascinating patterns of hate speech fluctuations over the past year.
BeNice
Sam, Mike
Winner: Intervention into Hate Crime
Winner: Increasing awareness of Hate Crime
Problem
Tweets that are offensive/hateful could cause emotional harm. Sometimes the "Tweeter" may not intend the tweet to be interpreted in the way that it is received.
Solution
A browser extension that alerts a "Tweeter" when their tweet may be considered harmful/offensive, before they post it.