On the Effect of Rendering and Randomisation for Visual Sim-to-Real Transfer

Abstract

Recent trends in robot learning have seen a shift towards deep learning-based solutions, with the aim of teaching robots various manipulation skills from visual inputs in an end-to-end manner. Nonetheless, deep learning models are known to be sample-inefficient, since a massive number of training images is required for the network to fit the data distribution. This presents a bottleneck in robotics, as real-world data collection is difficult, expensive, and sometimes dangerous. To circumvent these issues, researchers have used physical simulators, which are low-cost and scalable, to collect large datasets that would otherwise be infeasible to gather. However, the unmodelled noise and physical properties of the real world make it challenging to directly transfer models trained solely on synthetic data to the real world, a problem referred to as the reality gap. Domain randomisation is a simple yet promising technique for bridging this gap. It hypothesises that if the network is exposed to a plethora of visually randomised scenes, it will be able to treat the real-world environment as just another variation. Several works have shown the potential of domain randomisation in robotics, resulting in multiple successful transfers of manipulation skills to the real world. To the best of our knowledge, however, none of these works has investigated the impact of the physical simulator's fidelity on the model's overall sim-to-real transfer performance. In this thesis, we present a comprehensive empirical study on the effect of simulation quality on a model's transferability to the real world. We show that models trained with high-quality simulated scenes transfer to the real world with lower error than their low-quality counterparts. Furthermore, we thoroughly discuss the results of several ablation studies conducted to understand the models' sensitivity to different randomisation settings.
Lastly, we show a successful transfer of a 6D pose estimation model to the real world, without the model being exposed to a single real-world training example.

Copyright owned by the Saudi Digital Library (SDL) © 2025