Understanding Ransomware and Enhancing Their Detection Using Machine Learning
| dc.contributor.advisor | Xiao, Yang | |
| dc.contributor.author | Alzahrani, Saleh | |
| dc.date.accessioned | 2026-05-12T11:08:34Z | |
| dc.date.issued | 2026 | |
| dc.description.abstract | Ransomware attacks have escalated significantly in recent years, causing substantial financial losses and operational disruptions to individuals, organizations, and critical infrastructure worldwide. According to the Chainalysis 2024 Crypto Crime Report, the total value received by ransomware attackers reached $1.1 billion in 2023, up from $567 million in 2022 and $220 million in 2019. This trend highlights the evolving threat posed by ransomware as attackers continue to refine their methods. Despite the proliferation of detection methods, contemporary ransomware continues to bypass traditional security measures through increasingly sophisticated evasion techniques. This dissertation addresses critical gaps in ransomware detection research through an investigation that combines in-depth malware analysis, evolutionary tracking, systematic literature review, novel detection methodology, and dataset development. The research begins with a detailed examination of Conti ransomware, one of the most notorious Ransomware-as-a-Service operations, which caused approximately $45 million in damages and significantly impacted healthcare systems. Through analysis of leaked source code and controlled environment testing, this study reveals advanced evasion mechanisms including API disguise techniques, anti-hook mechanisms, and multithreaded encryption for rapid file encryption. Building upon this foundation, the research tracks Conti's evolution from its beta version through multiple iterations, categorizing samples into seven distinct versions. This longitudinal analysis demonstrates that modern ransomware success stems from continuous development and delivery practices, with features such as API hashing and runtime API loading being progressively integrated over time.
To contextualize these findings within the broader detection landscape, a survey of existing ransomware detection methods was conducted, examining both machine learning and non-machine learning approaches alongside available datasets. This survey identifies critical limitations in current research, specifically that non-machine learning methods fail to identify new samples from known variants, while machine learning approaches suffer from inadequate model design and the absence of comprehensive, standardized datasets. These deficiencies severely limit their effectiveness against emerging ransomware variants. Addressing these identified gaps, this dissertation introduces RansomFormer, a Transformer-based detection model that leverages cross-attention mechanisms to fuse Portable Executable byte data with Application Programming Interface information, including both static imports and dynamic sequence calls. Unlike existing single-feature approaches that ransomware developers can circumvent, RansomFormer's multi-modal architecture achieves exceptional accuracy of 99.25% on static datasets and 99.50% on combined static-dynamic datasets across more than 150 ransomware families. Furthermore, recognizing the fundamental need for comprehensive training data, this dissertation presents RanDS, a rigorously curated dataset comprising a large collection of ransomware samples spanning hundreds of families alongside a substantial set of benign samples, collected and verified over multiple years from an initial corpus of millions of malware files. RanDS includes several processed feature extraction datasets encompassing static raw strings, English strings, imported and exported APIs, demangled APIs, and dynamic behavioral activities, all made publicly available. 
This dissertation makes contributions to cybersecurity by providing deep insights into modern ransomware operations, demonstrating the importance of evolutionary analysis in understanding threat progression, and delivering both a detection methodology and a foundational dataset that address longstanding research limitations in the field. | |
| dc.format.extent | 225 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14154/78961 | |
| dc.language.iso | en_US | |
| dc.publisher | Saudi Digital Library | |
| dc.subject | Ransomware | |
| dc.subject | Ransomware Detection | |
| dc.subject | Machine Learning | |
| dc.subject | Malware | |
| dc.subject | Malware Detection | |
| dc.subject | ML | |
| dc.subject | Computer Security | |
| dc.subject | Cybersecurity | |
| dc.subject | AI | |
| dc.subject | Dataset | |
| dc.subject | Ransomware Dataset | |
| dc.title | Understanding Ransomware and Enhancing Their Detection Using Machine Learning | |
| dc.type | Thesis | |
| sdl.degree.department | Computer Science | |
| sdl.degree.discipline | Computer Science | |
| sdl.degree.grantor | The University of Alabama | |
| sdl.degree.name | Doctor of Philosophy |
