Leveraging Social Media Data for Detection and Monitoring of Depression

Alhamed, Falwah Abdulaziz

dc.contributor.advisor	Specia, Lucia
dc.contributor.author	Alhamed, Falwah Abdulaziz
dc.date.accessioned	2025-11-08T21:42:53Z
dc.date.issued	2025
dc.description.abstract	Mental health disorders are increasingly prevalent, with depression being the most common and a significant cause of disability and suicide worldwide. Understanding its symptoms, severity, and progression is vital for improving early detection and intervention. This thesis adopts a data-driven AI approach, constructing a large, expert-annotated dataset and developing models to monitor depression from social media language. We first design a data collection and curation framework to build a large-scale dataset of posts from individuals who self-report depression. In collaboration with psychiatrists and psychologists, we create an annotation scheme for labelling symptoms and severity over time. Experienced psychologists annotate the data, resulting in DepSy, the largest English dataset fully annotated for depression symptoms and severity progression, comprising 40,000 posts. This dataset underpins all subsequent experiments. We then benchmark multiple NLP approaches to classify posts written before versus after a reported depression diagnosis, and analyse linguistic patterns, emotion usage, and content variation. Among the models tested, BERT-based classifiers achieve the best overall performance, while large language models (LLMs) in zero-shot settings perform close to chance. Next, we address symptom detection as a multi-label classification problem. A bespoke BERT-based model achieves strong overall results, while a fine-tuned Llama-based model, DepSy-LLaMA, obtains higher recall, identifying more positive symptom cases, which is a valuable property in mental health detection. However, LLM predictions remain less reliable for sensitive applications. Finally, we explore the prediction of depression severity over time using deep learning and propose a hybrid CTMC-LSTM model that integrates Markov chains with an LSTM to capture temporal patterns. This model is the only one that detects severe cases, and it outperforms all baselines.
The findings demonstrate the importance of temporal modelling and expert-annotated data for building robust, ethical, and clinically informed systems for depression monitoring from social media.
dc.format.extent	290
dc.identifier.citation	Alhamed, F. A. (2025). Leveraging Social Media Data for Detection and Monitoring of Depression. Doctoral thesis, Imperial College London.
dc.identifier.issn	2558287
dc.identifier.uri	https://hdl.handle.net/20.500.14154/76906
dc.language.iso	en
dc.publisher	Saudi Digital Library
dc.subject	NLP
dc.subject	Depression Detection
dc.subject	BERT
dc.subject	Llama
dc.subject	GPT
dc.subject	Depression Symptoms
dc.subject	Social Media
dc.subject	Annotation Scheme
dc.subject	Dataset
dc.title	Leveraging Social Media Data for Detection and Monitoring of Depression
dc.type	Thesis
sdl.degree.department	Department of Computing
sdl.degree.discipline	Computer Science - Artificial Intelligence - Natural Language Processing
sdl.degree.grantor	Imperial College London
sdl.degree.name	Doctor of Philosophy

Files

Original bundle

Name: SACM-Dissertation.pdf
Size: 23.95 MB
Format: Adobe Portable Document Format

Collections

SACM - United Kingdom