Fake News Detection on Social Media: Methods and Techniques

Date

2024

Publisher

University of Leeds

Abstract

This thesis explores several new techniques for detecting fake news on social media and culminates in a proposal for a comprehensive fact-checking system. Although the main focus is on the Arabic language, the proposed methods and techniques are adaptable to other languages. Their effectiveness has been evaluated through experiments using data from social media platforms such as Twitter and from other sources of potential misinformation. A review of the literature reveals a substantial body of research on applying AI methods to automate the detection of online fake news. However, this research has weaknesses in several areas, including challenges across the entire detection pipeline, limitations in the methodologies employed, and inadequacies in existing datasets.

The thesis begins by introducing a novel approach to simulating interactions on social networks, which enables the tracking of user behaviours and the propagation of news to assess credibility. It then examines four techniques that can be considered for building an automatic fact-checking system and concludes by proposing a unified hybrid pipeline.

The first technique classifies claims based solely on their content. To evaluate this approach, three studies were conducted using different methods. The proposed methods showed promising results, achieving a macro F-score of 0.339 in the third study. These findings suggest that content-based techniques can be improved by incorporating additional information.

The second technique extends claim classification by combining content with additional external information. New structured methodologies were developed to extract potential features rather than relying solely on claim content. Examples of such methodologies include identifying sarcastic or hateful comments. Although the results showed that these features did not improve classification performance, they highlighted the potential value of such indicators. Specifically, the findings revealed that sarcasm or hate speech is nearly twice as prevalent in comments on false claims as in comments on true ones.

The third technique aims to automate fact-checking explainability based on the content of claims and news articles. To support this approach, a new dataset, FactEx, was collected from trusted fact-checking systems. This dataset was used to fine-tune generative models, the best-performing of which achieved a ROUGE score of 23.4.

The fourth technique fact-checks claims using retrieved information. A new verification system, named Ta’keed, was developed based on this technique to fact-check Arabic claims. Additionally, a new gold-labelled test set, ArFactEx, was compiled to assess Ta’keed. The evaluation shows that the proposed system outperformed models such as AraBERT that were fine-tuned on three benchmark Arabic datasets and tested on ArFactEx. It achieved an F1-score of 0.72, compared with 0.52, 0.61, and 0.54 for the other fine-tuned models. It also outperformed T5-based and AraT5-based models in generating justifications, with an average cosine similarity score of 0.76.

Finally, an optimised hybrid pipeline was introduced, which incorporates information retrieval and evidence extraction to enhance the classification task. The final proposed pipeline achieved an F1-score of 0.86, highlighting the importance of information retrieval in tackling disinformation.
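The abstract reports results as macro F-scores and cosine similarity scores. As a reading aid, the sketch below shows how these two metrics are commonly computed. It is a minimal illustration under stated assumptions, not the evaluation code used in the thesis: the function names `macro_f1` and `cosine_similarity`, the toy labels, and the toy vectors are all hypothetical, and the thesis may have used a different implementation or averaging convention.

```python
import math


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data, invented for illustration only.
gold = ["true", "false", "false", "true"]
predicted = ["true", "false", "true", "true"]
toy_macro_f1 = macro_f1(gold, predicted)

# Stand-ins for embeddings of a generated and a reference justification.
toy_similarity = cosine_similarity([1.0, 2.0, 0.0], [2.0, 4.0, 0.0])
```

Because macro averaging gives every class equal weight, a classifier that does poorly on a rare class is penalised more than it would be under accuracy or a micro-averaged score.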

Keywords

Fake News Detection on Social Media, Simulating Social Media, Fake News Detection, Comments for Fake News Detection, Tracking of User Behaviours, Simulating Interactions on Social Networks, Information Retrieval in Tackling Disinformation, Evaluating Trustworthiness, Visualizing Propagations in a Network, FactEx, Generative AI, Ta’keed, Hybrid Pipeline, Claim Verification, Evidence Extraction

Copyright owned by the Saudi Digital Library (SDL) © 2025