LIGHTREFINENET-SFMLEARNER: SEMI-SUPERVISED VISUAL DEPTH, EGO-MOTION AND SEMANTIC MAPPING

Date

2024

Publisher

Newcastle University

Abstract

Autonomous vehicles have attracted significant attention, particularly the complex software stacks that enable navigation, decision-making, and planning. Within these stacks, the Perception [1] component is critical: it allows a vehicle to understand its surroundings and maintain localisation. Simultaneous Localisation and Mapping (SLAM) plays a key role by enabling a vehicle to map an unknown environment while tracking its own position. Historically, SLAM has relied on heuristic techniques, but with the advent of the "Perception Age" [2], research has shifted towards more robust, high-level environmental awareness driven by advances in computer vision and deep learning. In this context, MLRefineNet [3] has demonstrated superior robustness and faster convergence in supervised learning tasks. When integrated into SfmLearner, however, MLRefineNet did not fully converge within 200 epochs, although performance improved with each epoch, which indicates its potential. SfmLearner [4] is a state-of-the-art deep learning model for visual odometry, known for its competitive depth and pose estimation. However, it lacks the high-level understanding of the environment that comprehensive perception in autonomous systems requires. This paper addresses this limitation by introducing a multi-modal shared encoder-decoder architecture that integrates semantic segmentation and depth estimation. The inclusion of high-level environmental understanding not only enhances scene interpretation, such as identifying roads, vehicles, and pedestrians, but also improves the depth estimation of SfmLearner. This multi-task learning approach strengthens the model's overall robustness, marking a significant step forward in the development of autonomous vehicle perception systems.

Keywords

Artificial Intelligence, Deep Learning, Machine Learning, Computer Vision, SLAM, Autonomous Vehicles, Semi-supervised Learning, Multi-modal Model

Copyright owned by the Saudi Digital Library (SDL) © 2025