EXTRACTION OF TEMPORAL RELATIONSHIPS BETWEEN EVENTS FROM NEWS ARTICLES FOR TIMELINE GENERATION

Batista-Navarro, Riza
Alsayyahi, Sarah
2024-07-01
2024-07-01
0024-06-27
https://hdl.handle.net/20.500.14154/72434

Extracting temporal information from natural language texts is crucial for understanding the sequence and context of events, enhancing the accuracy of timeline generation and event analysis in various applications. However, within the NLP community, determining the temporal ordering of events has been recognised as a challenging task. This difficulty arises from the inherent vagueness of temporal information found in natural language texts like news articles.

In Temporal Information Extraction (TIE), different datasets and methods have been proposed to extract various types of temporal entities, including events, temporal expressions, temporal relations, and the relative order of events. Some of these tasks have been considered easier than others in the field. For instance, extracting temporal expressions or events is easier than determining the optimal order of a set of events. Determining event order is complex because it requires commonsense and external knowledge, which is not readily accessible to computers. In contrast, humans can effortlessly identify this chronological order by relying on their external knowledge and understanding to establish the most appropriate sequence.

In this thesis, our goal was to improve the performance of state-of-the-art methods for determining the temporal order of events in news articles. Accordingly, we present the following contributions:

1. We reviewed the literature by conducting a systematic survey, categorising tasks and datasets relevant to extracting the order of events mentioned in news articles. We also identified existing findings and highlighted some research directions worth further investigation.
2. We proposed a novel annotation scheme with an unambiguous definition of the types of events and temporal relations of interest. Adopting this scheme, we developed a TIMELINE dataset, which annotates both verb and nominal events and considers the long-distance temporal relations between events separated by more than one sentence.
3. We integrated problem-related features with a neural-based method to improve the model's ability to extract temporal relations that involve nominal events and relations belonging to small classes (e.g., the EQUAL class). We found that integrating these features significantly improved the performance of the neural baseline model and achieved state-of-the-art results on two datasets from the literature.
4. We proposed a framework that uses local search algorithms (e.g., Hill Climbing and Simulated Annealing) to generate document-level timelines from a set of temporal relations. These algorithms improved the performance of current models and solved the problem in less time than state-of-the-art models.

139
en
Natural Language Processing
Thesis
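
The local search idea in contribution 4 can be illustrated with a minimal hill-climbing sketch. This is not the thesis's framework: the input format (weighted BEFORE relations between event names), the scoring function, the pairwise-swap neighbourhood, and the example events are all assumptions made for illustration.

```python
from itertools import combinations


def score(order, relations):
    """Sum the weights of the BEFORE relations that the ordering satisfies.

    `relations` is a hypothetical format: (a, b, weight) means
    "event a happens BEFORE event b" with the given confidence.
    """
    position = {event: i for i, event in enumerate(order)}
    return sum(w for a, b, w in relations if position[a] < position[b])


def hill_climb(events, relations):
    """Apply the best-scoring pairwise swap until no swap improves the score."""
    order = list(events)
    best = score(order, relations)
    while True:
        best_swap = None
        for i, j in combinations(range(len(order)), 2):
            order[i], order[j] = order[j], order[i]  # try the swap
            candidate = score(order, relations)
            order[i], order[j] = order[j], order[i]  # undo it
            if candidate > best:
                best, best_swap = candidate, (i, j)
        if best_swap is None:
            return order, best  # local optimum reached
        i, j = best_swap
        order[i], order[j] = order[j], order[i]


# Illustrative, made-up relations for three events in a news story.
relations = [
    ("arrest", "trial", 0.9),
    ("trial", "verdict", 0.8),
    ("arrest", "verdict", 0.7),
]
timeline, total = hill_climb(["verdict", "arrest", "trial"], relations)
print(timeline, total)
```

Hill climbing of this kind stops at the first local optimum it reaches. Simulated annealing differs in that it also accepts some score-reducing swaps, with a probability that shrinks as a temperature parameter is lowered, which helps it escape such optima.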