Saudi Cultural Missions Theses & Dissertations

Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10

Search Results

Now showing 1 - 2 of 2
  • Item (Restricted)
    Sentiment-Driven Health Messaging: Comparing DistilBERT and VADER on Timed Narratives with Animated Educational Videos using Manim
    (Saudi Digital Library, 2025) Alshabanah, Alozuf Mohammed; Ansari, Tayyab Ahmad
    Abstract: This project builds two educational animations using Manim and explores whether general-purpose sentiment analysis tools can help estimate the “healthiness” of short, child-friendly health education videos. Each video's script is divided into timestamped segments labelled as Healthy or Unhealthy. The focus is on comparing a transformer-based sentiment classifier (DistilBERT SST-2) with a rule-based tool (VADER). For DistilBERT, POSITIVE sentiment is mapped to Healthy and NEGATIVE to Unhealthy. For VADER, a score threshold is applied (≥0.05 for Healthy, ≤−0.05 for Unhealthy, and values in between as Neutral). For evaluation, “Neutral” predictions are converted to “Unhealthy”. Precision, recall, and F1 scores (per video and overall) are calculated, and confusion matrices are used to visualize performance. On the custom dataset of 24 segments, both methods perform well when the language is clearly positive or negative. DistilBERT rarely produces Neutral labels but sometimes misinterprets healthy advice that includes negation. VADER, however, often predicts Neutral, which is penalized by the evaluation method. The study also discusses the limitations of using sentiment as a proxy for healthfulness, including the effects of class imbalance and of how Neutral predictions are handled. A few potential improvements are suggested, such as domain-specific model tuning, more accurate threshold settings, and treating Neutral as an undecided, or “abstain”, option.
  • Item (Restricted)
    An NLP-Driven Framework for Business Email Compromise Detection and Authorship Verification
    (Saudi Digital Library, 2025) Almutairi, Amirah; AlHashimy, Nawfal; Kang, BooJoong
    Business Email Compromise (BEC) presents a critical cybersecurity threat, leveraging linguistic impersonation and social engineering rather than traditional malicious payloads. These attacks routinely evade conventional filters by mimicking legitimate communication styles and exploiting trusted identities. This thesis explores content-based detection strategies for BEC using a sequence of natural language processing (NLP) models. First, it proposes a transformer-based classifier to detect semantic indicators of deception in email body text. Second, it develops a Siamese authorship verification (AV) model that captures stylistic consistency, even under adversarial mimicry. These components are unified within a multi-task learning (MTL) framework that simultaneously optimizes for BEC detection and AV by sharing underlying representations while preserving task-specific objectives. To support empirical evaluation, a structured taxonomy of BEC fraud is introduced, and a synthetic email dataset is generated through prompt-guided language model fine-tuning and human validation. Experiments on combined real and synthetic corpora demonstrate that the MTL model achieves up to 97% F1-score in BEC detection and 93% in AV, outperforming a transfer learning baseline while reducing false positives and computational overhead. This work contributes a principled, modular, and extensible framework for enhancing email security through joint semantic and stylistic analysis, addressing gaps in current defenses against sophisticated impersonation attacks.
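
For the first item above, the VADER thresholding and Neutral handling described in its abstract could be sketched roughly as follows. This is an illustration under assumptions, not the authors' code: it assumes the thresholded score is VADER's compound score, it treats Healthy as the positive class, and the function names, scores, and gold labels are all hypothetical.

```python
# Sketch only: hypothetical helper names and made-up example data.

def vader_label(compound):
    """Map a score to a label using the thresholds stated in the abstract."""
    if compound >= 0.05:
        return "Healthy"
    if compound <= -0.05:
        return "Unhealthy"
    return "Neutral"


def collapse_neutral(label):
    """Evaluation rule from the abstract: Neutral counts as Unhealthy."""
    return "Unhealthy" if label == "Neutral" else label


def precision_recall_f1(gold, pred, positive="Healthy"):
    """Binary precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Made-up scores and gold labels for six script segments (not from the study).
scores = [0.62, 0.01, -0.40, 0.30, -0.02, 0.00]
gold = ["Healthy", "Healthy", "Unhealthy", "Healthy", "Unhealthy", "Healthy"]

raw = [vader_label(s) for s in scores]
pred = [collapse_neutral(label) for label in raw]
precision, recall, f1 = precision_recall_f1(gold, pred)
print(raw)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

In this toy data, two Healthy segments fall in the Neutral band and are counted as Unhealthy, which lowers recall for Healthy. This mirrors the abstract's remark that frequent Neutral predictions are penalized under this evaluation rule.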

Copyright owned by the Saudi Digital Library (SDL) © 2026