Marshall Ma, Xiaogang
Alkomah, Fatimah
2024-02-26
2023
https://hdl.handle.net/20.500.14154/71499

Hate speech is a toxic form of discourse that arises from prejudice or conflict between groups within and across societies, and it can trigger episodes that proliferate quickly on social media. Because it spreads so rapidly there, hate speech affects both people and culture. As the number of social media users (on Twitter, for example) grows, the effect of hate speech may become significant, in part because users can easily remain anonymous. Several machine learning models have been proposed to identify hate speech on social media; nevertheless, a number of difficulties limit existing techniques. One difficulty is that hate speech is understood in multiple ways, which results in many categories and interpretations. In addition, existing machine learning algorithms generalize poorly because they rely on small datasets and incorporate only a few characteristics of hate speech. Most hate speech systems focus on n-grams, part-of-speech tags, and sentiment, while some use lexicons as additional criteria. This research is motivated primarily by a desire to safeguard members of diverse groups, faiths, and identities against harassment, sarcasm, and harm. The work will also be useful to social media platforms, where offensive content could be blocked immediately. The purpose of this research is to (1) identify and extract textual features of hate speech from the literature, (2) study and analyze current benchmark datasets for hate speech detection, and (3) develop a machine learning model for textual hate speech detection based on newly proposed feature sets. The generic approach proposed here is a multi-label classification model based on an existing Twitter dataset of 150k tweets, the multimodal hate speech dataset (MMHS150K).
The tweet text was extracted and preprocessed, with steps such as stop word removal, lowercasing, and emoji preprocessing. The literature describes many features; selecting a subset of them is therefore crucial to developing a successful hate speech detection model. Three groups of features were considered: (1) Feature set 1: counts of hashtags, usernames, emojis, and URLs; (2) Feature set 2: term frequency-inverse document frequency (TF-IDF) and word embedding features; and (3) Feature set 3: psychological trait features based on the Linguistic Inquiry and Word Count (LIWC). This research evaluates several machine learning techniques on the dataset using F1-measure and accuracy, and compares the results with previous work. The methods also include applying the best-performing model to an unseen set of tweets (as a case study) to validate it. The proposed approach is carried out with the following models: Naïve Bayes (NB), Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), Random Forest (RF), K-Nearest Neighbors (KNN), Decision Trees (DT), Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Bidirectional Encoder Representations from Transformers (BERT). Results indicate that RF and BERT are the most effective approaches for identifying hate speech content, and that the most useful features for hate speech detection are psychological characteristics (i.e., LIWC) and word embeddings. For binary classification of hate speech, the F1-measure of most trained models was over 95%. On real-world, unseen examples, the best-performing model (BERT) classified 70% of the examples correctly using the LIWC and word embedding feature sets. The proposed BERT model could therefore be used to detect hate speech promptly on social networks such as Twitter.
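As an illustration of Feature set 1 and of the F1-measure used for evaluation, the following is a minimal sketch rather than the thesis implementation. The regular expressions, the emoji code point ranges, and the function names `surface_counts` and `binary_f1` are assumptions made for this example; the abstract does not specify how the counts were extracted.

```python
import re

# Hypothetical patterns: the abstract does not state the exact extraction rules.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
# Rough approximation of emoji code points (assumption, not exhaustive).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def surface_counts(tweet: str) -> dict:
    """Count hashtags, usernames, emojis, and URLs in one tweet (Feature set 1)."""
    urls = URL_RE.findall(tweet)
    # Remove URLs first so that '#' or '@' inside a link is not counted.
    rest = URL_RE.sub(" ", tweet)
    return {
        "hashtags": len(HASHTAG_RE.findall(rest)),
        "usernames": len(MENTION_RE.findall(rest)),
        "emojis": len(EMOJI_RE.findall(rest)),
        "urls": len(urls),
    }


def binary_f1(y_true: list, y_pred: list) -> float:
    """F1-measure for the positive class (label 1) of a binary task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # With no true or predicted positives, F1 is undefined; return 0.0 by convention.
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    print(surface_counts("@user1 look at this #hate #speech https://t.co/abc 😡😡"))
    print(binary_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a pipeline of the kind described, such counts would be concatenated with the TF-IDF, word embedding, and LIWC features before training; how the thesis combined them is not stated in the abstract.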
As anticipated, the trained models showed that binary classification yields satisfactory results, whereas multi-label classification still needs improvement. Implications: The findings confirm the complexity of hate speech detection, which stems from the broad range of definitions of hate speech. The work has implications for theory, offering newly adapted machine learning models that could be applied to unseen data from Twitter or similar social media platforms. The newly trained model might be helpful for Twitter's algorithms, while the new feature combinations could also be useful for other research in natural language processing. It is concluded that multi-label classification remains difficult owing to a paucity of datasets and the differing definitions of hate speech. Substantial further study is therefore required to develop features that perform well across varied datasets and multifaceted conceptions of hate speech. Furthermore, the literature offers no guidelines that ensure hate speech detection algorithms are compared effectively across datasets. It may also be worthwhile to supplement the current dataset with additional hate speech keywords. The proposed model would then need periodic retraining, because users adopt new phrases and abandon obsolete terms over time.

124
en-US
Hate speech; classification; speech detection
TWITTER HATE SPEECH DETECTION BASED ON DEEP LEARNING METHODS
Thesis