Evaluating Text Summarization with Goal-Oriented Metrics: A Case Study using Large Language Models (LLMs) and Empowered GQM

Altamimi, Rana

Files

SACM-Dissertation.pdf (1002.98 KB)

Date

2024-09

Authors

Altamimi, Rana

Publisher

University of Birmingham

Abstract

This study evaluates the performance of Large Language Models (LLMs) on dialogue summarization, focusing on Gemma and Flan-T5. Using a mixed-methods approach, we applied the SAMSum dataset and developed an enhanced Goal-Question-Metric (GQM) framework for comprehensive assessment. Our evaluation combined traditional quantitative metrics (ROUGE, BLEU) with qualitative assessments performed by GPT-4, covering multiple dimensions of summary quality. Flan-T5 consistently outperformed Gemma on both quantitative and qualitative metrics. It excelled in lexical overlap measures (ROUGE-1: 53.03, BLEU: 13.91) and scored higher in the qualitative assessments, particularly in conciseness (81.84/100) and coherence (77.89/100). Gemma, while competent, lagged behind Flan-T5 on most metrics. These results highlight the effectiveness of Flan-T5 for dialogue summarization and underscore the importance of a multi-faceted evaluation approach when assessing LLM performance. Our findings suggest that future work in this field should improve both lexical fidelity and higher-level qualities such as coherence and conciseness. This study contributes to the growing body of research on LLM evaluation and offers insights for improving dialogue summarization techniques.

Keywords

Artificial Intelligence, Large Language Models, Goal-Question-Metric, Natural Language Processing, Software Engineering

URI

https://hdl.handle.net/20.500.14154/73494

Collections

SACM - United Kingdom
