Religious Hatred in Arabic Social Media: Analysis, Detection, and Personalization
Abstract
Middle Eastern societies have long suffered from civil wars and domestic tensions that are partly caused by conflicting religious beliefs. This thesis examines the extent of religious hate in Arabic social media, evaluates the impact of automated accounts (i.e., bots) and personalized recommendation algorithms on its spread, and investigates social computing methods for automatically recognizing Arabic-language content and bots that promote religious hatred. First, the thesis addresses the scarcity of Arabic resources in the field by creating two publicly available, annotated Arabic datasets for Twitter and YouTube through crowdsourcing. It then presents a comprehensive analysis highlighting the prevalence of religious hatred on Arabic social networks, the most targeted religious groups, the distinguishing characteristics of perpetrators, and the differences between Twitter and YouTube in terms of hate speech volume and targeted groups. Building on these insights, it develops and evaluates several supervised machine learning models to detect hateful content automatically and efficiently. The thesis also contributes new insights into the role of Arabic-language bots in spreading religious hatred on Twitter and introduces a novel regression model tailored to detect Arabic-tweeting bots. Finally, the thesis audits YouTube’s recommendation algorithm to assess how personalization based on demographics and watch history affects the extent of hateful content recommended to users. The research presented in this thesis offers practical implications for platform designers, helping them enforce their policies against hate and malicious automation, and contributes to the broader effort to combat online radicalization.
Keywords
Hate Speech, Detection, Arabic NLP, Machine Learning, Data Mining, Algorithmic Auditing