LIGHTREFINENET-SFMLEARNER: SEMI-SUPERVISED VISUAL DEPTH, EGO-MOTION AND SEMANTIC MAPPING

Alshadadi, Abdullah Turki

LIGHTREFINENET-SFMLEARNER: SEMI-SUPERVISED VISUAL DEPTH, EGO-MOTION AND SEMANTIC MAPPING

dc.contributor.advisor	Holder, Chris
dc.contributor.author	Alshadadi, Abdullah Turki
dc.date.accessioned	2024-12-24T07:43:08Z
dc.date.issued	2024
dc.description.abstract	The advancement of autonomous vehicles has garnered significant attention, particularly in the development of complex software stacks that enable navigation, decision-making, and planning. Among these, the Perception [1] component is critical, allowing vehicles to understand their surroundings and maintain localisation. Simultaneous Localisation and Mapping (SLAM) plays a key role by enabling vehicles to map unknown environments while tracking their positions. Historically, SLAM has relied on heuristic techniques, but with the advent of the "Perception Age," [2] research has shifted towards more robust, high-level environmental awareness driven by advancements in computer vision and deep learning. In this context, MLRefineNet [3] has demonstrated superior robustness and faster convergence in supervised learning tasks. However, despite its improvements, MLRefineNet struggled to fully converge within 200 epochs when integrated into SfmLearner. Nevertheless, clear improvements were observed with each epoch, indicating its potential for enhancing performance. SfmLearner [4] is a state-of-the-art deep learning model for visual odometry, known for its competitive depth and pose estimation. However, it lacks high-level understanding of the environment, which is essential for comprehensive perception in autonomous systems. This paper addresses this limitation by introducing a multi-modal shared encoder-decoder architecture that integrates both semantic segmentation and depth estimation. The inclusion of high-level environmental understanding not only enhances scene interpretation—such as identifying roads, vehicles, and pedestrians—but also improves the depth estimation of SfmLearner. This multi-task learning approach strengthens the model’s overall robustness, marking a significant step forward in the development of autonomous vehicle perception systems.
dc.format.extent	9
dc.identifier.uri	https://hdl.handle.net/20.500.14154/74427
dc.language.iso	en
dc.publisher	Newcastle University
dc.subject	Artificial Intelligence
dc.subject	Deep Learning
dc.subject	Machine Learning
dc.subject	Computer Vision
dc.subject	SLAM
dc.subject	Autonomous Vehicles
dc.subject	Semi-supervised Learning
dc.subject	Multi-modal Model
dc.title	LIGHTREFINENET-SFMLEARNER: SEMI-SUPERVISED VISUAL DEPTH, EGO-MOTION AND SEMANTIC MAPPING
dc.type	Thesis
sdl.degree.department	School of Computing
sdl.degree.discipline	Data Science (with Specialisation in Artificial Intelligence)
sdl.degree.grantor	Newcastle University
sdl.degree.name	Master of Science

Files

Original bundle

Now showing 1 - 1 of 1

Name:: SACM-Dissertation.pdf
Size:: 377.67 KB
Format:: Adobe Portable Document Format

Download

License bundle

Now showing 1 - 1 of 1

Name:: license.txt
Size:: 1.61 KB
Format:: Item-specific license agreed to upon submission
Description:

Download

Collections

SACM - United Kingdom