Multi-Document Summarisation from Heterogeneous Software Development Artefacts

dc.contributor.advisor: A/Prof Markus Wagner
dc.contributor.author: Mahfouth Ahmad Alghamdi
dc.date: 2022
dc.date.accessioned: 2022-06-04T18:20:51Z
dc.date.available: 2022-02-08 15:50:43
dc.date.available: 2022-06-04T18:20:51Z
dc.description.abstract: Software engineers create a vast number of artefacts during project development activities; these artefacts consist of related information exchanged between developers. Sifting through the large amount of information available within a project repository can be time-consuming. In this dissertation, we proposed a method for multi-document summarisation from heterogeneous software development artefacts that automatically generates summaries to help software developers target their information needs. To achieve this aim, we first had gold-standard summaries created; we then characterised them and used them to identify the main types of software artefacts that describe developers’ activities in GitHub project repositories. This initial step was important for the present study, as we had no prior knowledge about the types of artefacts linked to developers’ activities that could be used as sources of input for our proposed multi-document summarisation techniques. We also used the gold-standard summaries later to evaluate the quality of our summarisation techniques. We then developed extractive multi-document summarisation approaches to automatically summarise software development artefacts within a given time frame by integrating techniques from natural language processing, software repository mining, and data-driven search-based software engineering. The generated summaries were then evaluated in a user study to investigate whether experts considered that the generated summaries mentioned every important project activity that appeared in the gold-standard summaries. The results of the user study showed that generating summaries from different kinds of software artefacts is possible, and that the generated summaries are useful in describing a project’s development activities over a given time frame. Finally, we investigated the potential of using source code comments for summarisation by assessing the documented information of Java primitive variables in comments against three types of knowledge. The results showed that source code comments did contain additional information and could be useful for summarising developers’ development activities.
dc.format.extent: 135
dc.identifier.other: 110068
dc.identifier.uri: https://drepo.sdl.edu.sa/handle/20.500.14154/63932
dc.language.iso: en
dc.publisher: Saudi Digital Library
dc.title: Multi-Document Summarisation from Heterogeneous Software Development Artefacts
dc.type: Thesis
sdl.degree.department: Computer Science
sdl.degree.grantor: The University of Adelaide / Faculty of Engineering, Computer and Mathematical Sciences
sdl.thesis.level: Doctoral
sdl.thesis.source: SACM - Australia