Automatic Generation of a Coherent Story from a Set of Images

Mian, Ajaml; Hassan, Ghulam Mubashar; Aljawy, Zainy

2024-01-11; 2024-01-11; 2023-12

Zainy M. Malakan. (2023). Automatic Generation of a Coherent Story from a Set of Images. [Doctoral Thesis, The University of Western Australia].

https://hdl.handle.net/20.500.14154/71156

This dissertation explores vision and language (V&L) algorithms. While V&L succeeds in image and video captioning tasks, the dynamic Visual Storytelling Task (VST) remains challenging. VST demands coherent stories from a set of images, requiring grammatical accuracy, flow, and style. The dissertation addresses these challenges. Chapter 2 presents a framework utilizing an advanced language model. Chapters 3 and 4 introduce novel techniques that integrate rich visual representation to enhance generated stories. Chapter 5 introduces a new storytelling dataset with a comprehensive analysis. Chapter 6 proposes a state-of-the-art Transformer-based model for generating coherent and informative story descriptions from image sets.

138

en-US

Storytelling; Sequential Vision Understanding; Computer Vision; image and video captioning; Deep Learning; Transformer; Advanced Language Model

Thesis