How Hate Can Be Detected Reliably
The most important prerequisite for reliably countering hate comments on the internet is finding them quickly and reliably. The Stop Hate Speech project uses the Bot Dog algorithm, which was trained in collaboration with researchers at ETH Zurich and the University of Zurich. A study that describes the development and training of the algorithm in detail was presented and published at the end of 2022 at an international conference (EMNLP 2022).
Bot Dog is based on a unique dataset of more than 420,000 online comments in German and French. An online community and research students annotated each comment for whether it contained hate and, if so, whom the hate targeted (find out more about the definition of hate speech). This large number of examples allowed the algorithm to learn to detect German and French hate speech on Swiss media platforms and on Twitter.
The algorithm was developed over the course of a year and multiple iterations by a research team from ETH Zurich and the University of Zurich under the supervision of Ana Kotarcic. A simple statistical classifier served as the starting point. This approach allows comments that are likely to contain hate to be detected in significantly smaller amounts of data. It is crucial that humans continually check the algorithm's accuracy at each iteration (human-in-the-loop). These human annotators ensure a steady flow of newly annotated online comments from which Bot Dog can keep learning.
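The iterative human-in-the-loop procedure described above can be illustrated with a minimal sketch. This is not the Bot Dog implementation: the Naive Bayes classifier, the function names, the toy comments and the `annotate` callback (a stand-in for a human annotator) are all assumptions chosen for illustration. The sketch only shows the general idea of a simple statistical classifier pre-selecting comments for annotation, followed by re-training on the newly annotated comments.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative)."""

    def fit(self, texts, labels):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def prob_hate(self, text):
        """Return the estimated probability that the text has label 1."""
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label in (0, 1):
            total_words = sum(self.word_counts[label].values())
            # Smoothed class prior, so an empty class never yields log(0).
            score = math.log((self.class_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Normalise the two log scores into a probability.
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        return exp[1] / (exp[0] + exp[1])


def human_in_the_loop(seed, pool, annotate, rounds=2, batch_size=2):
    """Alternate between training, pre-selection and human annotation.

    seed:     list of (comment, label) pairs that are already annotated
    pool:     list of comments without labels
    annotate: hypothetical callback standing in for a human annotator
    """
    texts = [t for t, _ in seed]
    labels = [y for _, y in seed]
    pool = list(pool)
    model = NaiveBayes().fit(texts, labels)
    for _ in range(rounds):
        if not pool:
            break
        # Rank unlabeled comments by their estimated probability of hate.
        pool.sort(key=model.prob_hate, reverse=True)
        batch, pool = pool[:batch_size], pool[batch_size:]
        for comment in batch:
            texts.append(comment)
            labels.append(annotate(comment))  # the human decision
        # Re-train on the enlarged annotated set.
        model = NaiveBayes().fit(texts, labels)
    return model, list(zip(texts, labels))


# Toy data, invented for this example only.
seed = [("you are stupid idiots", 1), ("nice article thank you", 0)]
pool = [
    "all of them are idiots",
    "thank you for the nice report",
    "stupid people everywhere",
    "great weather today",
]


def toy_annotator(comment):
    """Stand-in for a human annotator; real annotation is manual."""
    return int("idiots" in comment or "stupid" in comment)
```

Calling `human_in_the_loop(seed, pool, toy_annotator, rounds=1)` sends the two comments the current model rates as most likely hateful to the annotator first. Sending the likeliest candidates to the annotators is only one possible selection strategy; the study itself should be consulted for the procedure actually used.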
As the number of annotated online comments grew with each iteration, the Bot Dog algorithm was trained with the latest transformer-based machine learning models. Compared to other multilingual classification algorithms for hate speech, Bot Dog is the most accurate. So far, it is also the only algorithm that has been specifically designed for the Swiss context.
However, the study also shows that hate on the internet is constantly changing. An algorithm that works well today might be much less effective in only a few months' time. This highlights how important it is to continuously re-train Bot Dog with current data. Within the Stop Hate Speech project, the algorithm allows for regular monitoring of hate speech and of moderators' decisions. This, for example, facilitates timely counter speech to effectively confront hate (see Counter Speech against Hate Speech). It is therefore important not only to identify hate comments quickly, but also to do so reliably in the context of current events.
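One simple way to notice that a classifier is becoming less effective is to track how often its labels agree with moderators' decisions over time. The sketch below is an illustration under stated assumptions, not the project's monitoring setup: the record format `(period, model_label, moderator_label)`, the function names and the threshold of 0.8 are all hypothetical.

```python
from collections import defaultdict


def agreement_by_period(records):
    """Share of comments per period on which model and moderator agree.

    Each record is a (period, model_label, moderator_label) tuple.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, model_label, moderator_label in records:
        totals[period] += 1
        hits[period] += int(model_label == moderator_label)
    return {p: hits[p] / totals[p] for p in sorted(totals)}


def periods_needing_retraining(records, threshold=0.8):
    """Return the periods whose agreement falls below the threshold."""
    agreement = agreement_by_period(records)
    return [p for p, share in agreement.items() if share < threshold]
```

A drop in agreement does not by itself show that the model is wrong, since moderators can also disagree with one another. It is only a signal that newly annotated data and re-training may be needed.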
Study: Ana Kotarcic, Dominik Hangartner, Fabrizio Gilardi, Selina Kurer and Karsten Donnay. (2022). Human-in-the-Loop Hate Speech Classification in a Multilingual Context. In Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.