Roeterseilandcampus - Building G, Street: Nieuwe Achtergracht 129-B, Room: G3.03
Threatening text messages on social media pose significant risks to individuals, groups, and society. Monitoring, researching, removing, and even prosecuting such messages all rely on automatic detection methods to process the enormous volume of data. However, current models struggle with multilingualism, data scarcity, and semantic ambiguity. This study addresses these limitations by developing a multilingual language model, trained on a large, custom-built dataset of authentic and synthetic social media messages, to detect threats to life. In this presentation, I will discuss the dataset creation, model fine-tuning, and evaluation, highlighting both achievements and challenges.