LIGHTREFINENET-SFMLEARNER: SEMI-SUPERVISED VISUAL DEPTH, EGO-MOTION AND SEMANTIC MAPPING

Alshadadi, Abdullah Turki

LIGHTREFINENET-SFMLEARNER: SEMI-SUPERVISED VISUAL DEPTH, EGO-MOTION AND SEMANTIC MAPPING

Files

SACM-Dissertation.pdf (377.67 KB)

Date

2024

Authors

Alshadadi, Abdullah Turki

Publisher

Newcastle University

Abstract

The advancement of autonomous vehicles has garnered significant attention, particularly in the development of complex software stacks that enable navigation, decision-making, and planning. Among these, the Perception [1] component is critical, allowing vehicles to understand their surroundings and maintain localisation. Simultaneous Localisation and Mapping (SLAM) plays a key role by enabling vehicles to map unknown environments while tracking their positions. Historically, SLAM has relied on heuristic techniques, but with the advent of the "Perception Age," [2] research has shifted towards more robust, high-level environmental awareness driven by advancements in computer vision and deep learning. In this context, MLRefineNet [3] has demonstrated superior robustness and faster convergence in supervised learning tasks. However, despite its improvements, MLRefineNet struggled to fully converge within 200 epochs when integrated into SfmLearner. Nevertheless, clear improvements were observed with each epoch, indicating its potential for enhancing performance. SfmLearner [4] is a state-of-the-art deep learning model for visual odometry, known for its competitive depth and pose estimation. However, it lacks high-level understanding of the environment, which is essential for comprehensive perception in autonomous systems. This paper addresses this limitation by introducing a multi-modal shared encoder-decoder architecture that integrates both semantic segmentation and depth estimation. The inclusion of high-level environmental understanding not only enhances scene interpretation—such as identifying roads, vehicles, and pedestrians—but also improves the depth estimation of SfmLearner. This multi-task learning approach strengthens the model’s overall robustness, marking a significant step forward in the development of autonomous vehicle perception systems.

Keywords

Artificial Intelligence, Deep Learning, Machine Learning, Computer Vision, SLAM, Autonomous Vehicles, Semi-supervised Learning, Multi-modal Model

URI

https://hdl.handle.net/20.500.14154/74427

Collections

SACM - United Kingdom

Full item page

LIGHTREFINENET-SFMLEARNER: SEMI-SUPERVISED VISUAL DEPTH, EGO-MOTION AND SEMANTIC MAPPING

Files

Date

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Description

Keywords

Citation

URI

Collections

Endorsement

Review

Supplemented By

Referenced By