Toward Accurate Motion Estimation with Affine Correspondences



Abstract

Motion estimation using visual data is an essential component of today's self-driving cars, autonomous drones, robots, and many similar applications. Most proposed algorithms use either feature-based methods, which track distinctive features in images, or direct methods, which operate on the raw pixel intensities of the whole image. Feature-based algorithms typically rely on traditional detectors and descriptors, such as SIFT, SURF, Hessian, or Harris, to extract point-wise features and then match them across frames of the camera's video input. Recently, local affine features (LAFs) have become a hot topic in many 3D vision applications, including relative pose estimation and Structure from Motion (SfM), because they are more informative than point features. The clear benefit of using affine correspondences (ACs) is that each one adds three linear constraints on the fundamental matrix F, so fewer correspondences are required. This means the iterative, random sampling process of RANSAC takes fewer iterations to find the inlier matches. The use of ACs therefore reduces computation time while maintaining similar accuracy.

In this work, we focus on making motion estimation with affine correspondences more accurate, more robust, and applicable to a wider range of setups. We extend the epipolar constraint on affine correspondences to the multi-camera setting, and we propose a new solution for multi-camera motion estimation with affine correspondences based on this extended constraint. The new affine-based solver outperforms its point-based counterpart in runtime while maintaining similar accuracy; a statistical analysis of the experimental results on synthetic and real data confirms this conclusion. However, affine correspondences are known to be more vulnerable to noise than point correspondences, and current photometric refinement methods are costly, making such solutions unsuitable for real-time applications.
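The runtime benefit of smaller minimal samples follows from the standard RANSAC iteration bound. As an illustrative sketch (not from the thesis): the inlier ratio of 0.5, the confidence of 0.99, and the sample sizes 5 (classic five-point essential-matrix solver) versus 2 (two ACs, each contributing three constraints) are assumed values chosen only to show the effect.

```python
import math

def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Number of random samples needed so that, with the given
    confidence, at least one sample is all inliers."""
    return math.ceil(math.log(1.0 - confidence)
                     / math.log(1.0 - inlier_ratio ** sample_size))

w = 0.5  # assumed inlier ratio
n_points = ransac_iterations(w, 5)  # 5-point solver: 146 iterations
n_affine = ransac_iterations(w, 2)  # 2-AC solver: 17 iterations
```

With half the matches being inliers, the affine-based minimal solver needs roughly an order of magnitude fewer RANSAC iterations, which is the source of the speedup claimed above.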
We propose a new method to correct affine correspondences without applying expensive refinement techniques. Results on synthetic data and on a common benchmark dataset show a significant improvement in the performance of motion estimation using this new method. In addition, we propose a deep neural network that eliminates the high computational cost of estimating, refining, and optimizing affine correspondences; the new approach is feed-forward and non-iterative. We built a new dataset to train the network to estimate the affine transformation given two image patches, and the proposed network estimates this transformation very quickly. This work is a step toward making motion estimation using affine correspondences more efficient and accurate, as well as toward establishing a method that is applicable to a wider array of setup scenarios.
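For intuition about what "estimating the affine transformation" means, the sketch below recovers a 2×3 affine map from noiseless point correspondences by linear least squares. This is only an illustration of the underlying geometric relation x' = A·x + t; the ground-truth matrix, the random points, and the solver are all assumptions for the demo, not the thesis's network or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[1.1,  0.2,  3.0],
                   [-0.1, 0.9, -2.0]])  # assumed ground-truth 2x3 affine

pts = rng.uniform(-1.0, 1.0, size=(20, 2))
src = np.hstack([pts, np.ones((20, 1))])  # homogeneous source coordinates
dst = src @ A_true.T                      # corresponding target coordinates

# Least-squares fit of the affine parameters from the correspondences.
A_est, *_ = np.linalg.lstsq(src, dst, rcond=None)
A_est = A_est.T
```

In the noiseless case the fit is exact; with real image patches the correspondences are noisy, which is why refinement (or, as proposed here, a learned feed-forward estimator) is needed.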


Copyright owned by the Saudi Digital Library (SDL) © 2025