Research on Vision based Indoor Localization Algorithms Using Deep Learning

With the rapid development of automatic control technology and artificial intelligence, many kinds of robots have come into wide use across social life and production, showing good development prospects and a large market in e-commerce, health care, emergency rescue, logistics management, and other areas. Real-time positioning of a robot is a precondition for it to carry out its tasks well. To address the indoor localization problem, this work treats video images as the research object and makes a series of improvements to existing convolutional neural networks. The main work is as follows:

1. Existing indoor localization methods face bottlenecks such as multipath effects for Wi-Fi based methods, high cost for ultra-wideband based methods, and poor interference resistance for Bluetooth based methods. To avoid these problems, a vision-based indoor localization method using an improved VGGNet is proposed. First, the whole deployment environment is divided into several regions, and each region is assigned a location center. Then, in offline mode, VGG16Net is pre-trained on the ImageNet dataset and fine-tuned on image samples from our self-constructed dataset. In online mode, a real-time image taken by the front RGB camera of a mobile robot is fed into the fully trained, converged VGG16Net to extract location features. These features are then used as input to an ArcFace classifier, which outputs the current location of the mobile robot. Experimental results verify the performance of our algorithm.

2. The location features of images taken at nearby location points differ only slightly, especially in the center of the scene in our experimental environment. Because the size and shape of the path region in the image always vary in the center of the scene, semantic images are used in this chapter to extract the path region, and an improved U-Net is proposed for semantic segmentation and feature classification.
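As a rough illustration of the ArcFace classification step in item 1, the following Python sketch computes additive angular margin logits for training and picks the most similar region at inference. It is a minimal sketch under stated assumptions, not the implementation used in this work: the function names, the `scale` and `margin` defaults, and the toy two-region example are hypothetical, and the input feature vector merely stands in for the VGG16Net output.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes v is not all zeros).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_logits(feature, class_weights):
    # Cosine of the angle between the feature and each class center.
    f = l2_normalize(feature)
    return [sum(a * b for a, b in zip(f, l2_normalize(w)))
            for w in class_weights]

def arcface_logits(feature, class_weights, target, scale=30.0, margin=0.5):
    # Training-time logits: add an angular margin to the target class only.
    # Simplification: no special handling when theta + margin exceeds pi.
    logits = []
    for j, c in enumerate(cosine_logits(feature, class_weights)):
        if j == target:
            theta = math.acos(max(-1.0, min(1.0, c)))
            c = math.cos(theta + margin)
        logits.append(scale * c)
    return logits

def predict_region(feature, class_weights):
    # Inference: no margin, choose the class center with the highest cosine.
    cosines = cosine_logits(feature, class_weights)
    return max(range(len(cosines)), key=cosines.__getitem__)

# Toy example: two regions whose class centers lie along the axes.
weights = [[1.0, 0.0], [0.0, 1.0]]
feature = [2.0, 0.0]
region = predict_region(feature, weights)             # index of predicted region
logits = arcface_logits(feature, weights, target=0)   # margin applied to class 0
```

The margin lowers the target-class logit during training, which pushes features of the same region closer together in angle; at inference only the plain cosine similarity is needed.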
A new convolutional neural network structure based on U-Net is used, with a squeeze-and-excitation block added to optimize the network structure. In addition, a weighted multilayer cross-entropy loss function is proposed and applied to location classification. Experimental results show that, compared with other methods, our method gives more accurate localization results, especially at location points with obvious path features.

3. To make full use of the benefits of both RGB image based and semantic image based location features, a combined convolutional neural network (Comb-Net) is proposed. The network is composed of a complete U-Net, two copies of the first 13 layers of VGG16Net, the fully connected layers of VGG16Net, and an ArcFace classifier. The U-Net extracts a semantically segmented image from the RGB image, and the two copies of the first 13 layers of VGG16Net extract location features from the RGB image and the semantically segmented image, respectively. These location features are then combined by the fully connected layers of VGG16Net, and the ArcFace classifier produces the final classification result. In addition, a multi-layer transfer learning training method for complex convolutional neural networks is designed: transfer learning reduces the size of the training set required, and the layered strategy makes the model easier to train. Experimental results show that the proposed algorithm can localize an indoor mobile robot accurately; compared with the RGB image based method and the semantic image based method, the accuracy of our method increases by 10.7% and 11.8%, respectively.

4. To obtain high-accuracy indoor localization with a limited dataset, an improved ResNet is proposed. We modify the original ResNet to prevent the classification result from fluctuating frequently under small changes in the weight parameters. Based on the classical residual network, batch normalization, adapti