Similarity Learning on Many Core Architectures

Date

2019

Publisher

Saudi Digital Library

Abstract

Many recent works have shown that learning a metric is far better than using default metrics such as Euclidean distance or cosine similarity. Furthermore, similarity learning based on cosine similarity has been shown to work well on many data sets that are not necessarily textual in nature. However, similarity learning in nearest-neighbor algorithms has been inherently slow, owing to its O(d³) complexity. In this work, we address this shortcoming and propose a similarity learning algorithm for many-core architectures, whereby the Similarity Learning Algorithm (SiLA) is parallelized. A parallel variant of SiLA is introduced that affects different parts of the algorithm: the preprocessing step, training, validation, and testing. The resulting algorithm is faster than the traditional one on large data sets because of its parallel nature, as confirmed on UCI data sets. In the preprocessing step, the parallel approach achieves up to an 80x speedup. While the sequential algorithm takes several hours to finish training and testing (as long as 13.64 hours on the Letter data set), its parallel version completes many times faster (2.39 hours on Letter). Even though the parallel algorithm must do more computation during training, it still maintains a considerable speedup over the sequential one. During the testing phase, the same computation must be performed by both approaches, and the parallel algorithm achieves up to a 69x speedup.
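To make the setting concrete, the following is a minimal sketch of nearest-neighbor classification with a learned bilinear similarity s_A(x, y) = xᵀAy, a common form in cosine-based similarity learning. This is an illustrative assumption, not SiLA's actual update rule or the thesis's parallel implementation; the vectorized test-against-train similarity computation is simply the kind of data-parallel work a many-core device can accelerate.

```python
import numpy as np

def knn_predict(A, X_train, y_train, X_test, k=3):
    """kNN prediction under a learned bilinear similarity.

    All test/train similarities are computed in one vectorized step
    (X_test @ A @ X_train.T); on a many-core device this product is
    the natural target for parallelization. A is a hypothetical
    learned matrix -- A = I recovers a plain dot-product similarity.
    """
    S = X_test @ A @ X_train.T           # (n_test, n_train) similarities
    nn = np.argsort(-S, axis=1)[:, :k]   # k most-similar training points
    votes = y_train[nn]                  # their class labels
    # majority vote per test point
    return np.array([np.bincount(v).argmax() for v in votes])

# Toy example: two well-separated classes, with A = I (no learning yet).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal([5, 0], 0.1, (10, 2)),
                     rng.normal([0, 5], 0.1, (10, 2))])
y_train = np.array([0] * 10 + [1] * 10)
A = np.eye(2)
preds = knn_predict(A, X_train, y_train,
                    np.array([[5.0, 0.0], [0.0, 5.0]]))
```

With the clusters this cleanly separated, the two test points fall to classes 0 and 1 respectively.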


Copyright owned by the Saudi Digital Library (SDL) © 2025