Automatic Generation of a Coherent Story from a Set of Images

dc.contributor.advisor: Mian, Ajmal
dc.contributor.advisor: Hassan, Ghulam Mubashar
dc.contributor.author: Aljawy, Zainy
dc.date.accessioned: 2024-01-11T12:01:19Z
dc.date.available: 2024-01-11T12:01:19Z
dc.date.issued: 2023-12
dc.description.abstract: This dissertation explores vision and language (V&L) algorithms. While V&L methods succeed in image and video captioning tasks, the dynamic Visual Storytelling Task (VST) remains challenging. VST demands coherent stories from a set of images, requiring grammatical accuracy, flow, and style. The dissertation addresses these challenges. Chapter 2 presents a framework utilizing an advanced language model. Chapters 3 and 4 introduce novel techniques that integrate rich visual representation to enhance generated stories. Chapter 5 introduces a new storytelling dataset with a comprehensive analysis. Chapter 6 proposes a state-of-the-art Transformer-based model for generating coherent and informative story descriptions from image sets.
dc.format.extent: 138
dc.identifier.citation: Zainy M. Malakan. (2023). Automatic Generation of a Coherent Story from a Set of Images. [Doctoral Thesis, The University of Western Australia].
dc.identifier.uri: https://hdl.handle.net/20.500.14154/71156
dc.language.iso: en_US
dc.publisher: Saudi Digital Library
dc.subject: Storytelling
dc.subject: Sequential Vision Understanding
dc.subject: Computer Vision
dc.subject: Image and Video Captioning
dc.subject: Deep Learning
dc.subject: Transformer
dc.subject: Advanced Language Model
dc.title: Automatic Generation of a Coherent Story from a Set of Images
dc.type: Thesis
sdl.degree.department: Computer Science and Software Engineering
sdl.degree.discipline: Computer Vision and Artificial Intelligence
sdl.degree.grantor: The University of Western Australia
sdl.degree.name: Doctor of Philosophy
sdl.thesis.source: SACM - Australia
