A Hierarchical Extractive Text Summarization Approach

SAUD SHARI SAUD ALSHAHRANI

Abstract

The explosion of information has made it difficult for users to track information about the topics that interest them, and automatic text summarization is an important tool for addressing this problem. In this research, we focus on generating multiple complementary extractive summaries, each selecting the most relevant sentences in a given document from one or more points of view. To this end, we have developed a toolbox and a notation that allow complex summarizers to be constructed as networks of tools, or blocks. The basic blocks are the Keyword Extractor (KE), Sentence Extractor (SE), Semantic Grader (SG), Keyword Emphasizer (KM), and Sentence Orderer (SO), together with well-known preprocessing tools such as a preprocessor, a stemmer, and a stop-word remover. We also introduce a notation and an approach that allow algebraic operations, such as addition and subtraction, on documents and keyword lists at the level of words, sentences, or documents. To evaluate the performance of these summarizers, we apply established measures in the field, such as the Average ROUGE Summarization Metric (ARSM), and we also advocate the use of Effective Keyword Extraction (EKE) and Weighted Effective Keyword Extraction (WEKE). In addition, we have developed test procedures to verify the effectiveness of each summarization method; these are presented alongside the method they apply to. The proposed summarization methods focus primarily on the ideas within documents, where an idea is represented by a keyword or a combination of keywords. For example, we propose two methods that search for ideas common to several documents: Fuse Document First (FDF), which fuses the documents first, and Fuse Keywords First (FKF), which fuses the keywords first.
Moreover, we propose and develop methods that allow secondary ideas in a document to emerge separately from, and in addition to, the primary ideas. Two such algorithms, Subtract-Document (SD) and Stratified Keywords Portioning (SKP), are introduced and compared. Finally, we have created methods that summarize text documents based on the interaction between the ideas inside them, such as the Comprehensive-View Summarizer (CVS) and the Mutual View Summarizer (MVS).
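
The abstract names the blocks and fusion strategies but does not define them, so the following Python is only a rough sketch of how such a block network might be composed. Everything in it is an assumption made for illustration: the function names are hypothetical, the keyword extractor is a plain word-frequency ranking, "fusing keywords" is modeled as intersecting per-document keyword lists, and document subtraction is modeled as removing already-selected sentences. The thesis itself may define KE, SE, FDF, FKF, and SD quite differently.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "on", "to", "is", "are", "for", "with"}

def split_sentences(document):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

def tokenize(text):
    """Return lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def extract_keywords(document, k):
    """Hypothetical KE block: the k most frequent non-stop words."""
    counts = Counter(tokenize(document))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:k]

def extract_sentences(document, keywords, n):
    """Hypothetical SE block: the n sentences with the most keyword hits, in original order."""
    sentences = split_sentences(document)
    keyset = set(keywords)
    scores = [sum(1 for w in tokenize(s) if w in keyset) for s in sentences]
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:n]
    return [sentences[i] for i in sorted(top)]

def fuse_documents_first(docs, k):
    """FDF-style sketch: concatenate the documents, then extract keywords once."""
    return extract_keywords(" ".join(docs), k)

def fuse_keywords_first(docs, k):
    """FKF-style sketch: extract keywords per document, then keep those shared by all."""
    per_doc = [set(extract_keywords(d, k)) for d in docs]
    return sorted(set.intersection(*per_doc))

def subtract_document(document, removed_sentences):
    """Document-subtraction sketch: drop the given sentences from the document."""
    removed = set(removed_sentences)
    return " ".join(s for s in split_sentences(document) if s not in removed)

doc_a = "Solar power is growing. Solar panels are cheap. Wind is also growing."
doc_b = "Solar farms need land. Wind farms need land too. Solar output varies."

common_fdf = fuse_documents_first([doc_a, doc_b], 4)
common_fkf = fuse_keywords_first([doc_a, doc_b], 4)

# Primary summary of doc_a, then what is left once those sentences are subtracted.
primary = extract_sentences(doc_a, common_fkf, 2)
remainder = subtract_document(doc_a, primary)
secondary_keywords = extract_keywords(remainder, 2)
```

In this toy setup the two fusion orders can disagree: fusing documents first ranks words by their pooled frequency, while fusing keywords first keeps only words that rank highly in every document. Subtracting the primary summary before re-running the keyword extractor is one simple way that less prominent ideas could surface.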

URI

https://drepo.sdl.edu.sa/handle/20.500.14154/61618

Collections

SACM - United States of America
