Evaluating and Fine-Tuning Large Language Model-Powered Mental Health Chatbots in Arabic

Date

2024

Publisher

University of Birmingham

Abstract

Recent advances in AI tools have transformed patient assessment, appointment scheduling, and follow-up in the health sector, and they show particular promise for evaluating and providing mental health support. Although numerous mental health chatbots exist in English, mental health remains a challenging subject among Arabic speakers, for whom few, if any, effective chatbots are currently available. This project evaluates and fine-tunes existing large language models (LLMs) to help provide accurate mental health counselling to Arabic speakers. It used a total of 6917 question-answer pairs collected from the CounselChat platform, covering a range of common mental health topics, to fine-tune the BLOOMz 3b and Llama2 7b LLMs. Both models performed very poorly on statistical metrics but showed good results on model-based metrics. In inference testing, BLOOMz showed promising results, constructing coherent, clear and direct answers. With more careful and accurate data curation, and with use of the LLM-based evaluation framework, both BLOOMz and Llama2 could be used to develop real-world mental health chatbots that provide accurate mental health counselling to Arabic speakers.

Keywords

Mental health chatbots, Large language models, Arabic LLMs, Arabic chatbots, Llama2, BLOOMz.

Copyright owned by the Saudi Digital Library (SDL) © 2025