Automatic Generation of a Coherent Story from a Set of Images

Aljawy, Zainy

Date

2023-12

Authors

Aljawy, Zainy

Publisher

Saudi Digital Library

Abstract

This dissertation explores vision and language (V&L) algorithms. While V&L methods succeed in image and video captioning tasks, the dynamic Visual Storytelling Task (VST) remains challenging. VST demands a coherent story from a set of images, which requires grammatical accuracy, flow, and style. The dissertation addresses these challenges. Chapter 2 presents a framework that uses an advanced language model. Chapters 3 and 4 introduce novel techniques that integrate rich visual representations to enhance the generated stories. Chapter 5 introduces a new storytelling dataset with a comprehensive analysis. Chapter 6 proposes a state-of-the-art Transformer-based model for generating coherent and informative story descriptions from image sets.

Keywords

Storytelling, Sequential Vision Understanding, Computer Vision, Image and Video Captioning, Deep Learning, Transformer, Advanced Language Model

Citation

Zainy M. Malakan. (2023). Automatic Generation of a Coherent Story from a Set of Images. [Doctoral Thesis, The University of Western Australia].

URI

https://hdl.handle.net/20.500.14154/71156

Collections

SACM - Australia
