arXiv Analytics

arXiv:2203.02505 [cs.LG]

ARM 4-BIT PQ: SIMD-based Acceleration for Approximate Nearest Neighbor Search on ARM

Yusuke Matsui, Yoshiki Imaizumi, Naoya Miyamoto, Naoki Yoshifuji

Published 2022-03-03, Version 1

We accelerate 4-bit product quantization (PQ) on the ARM architecture. Notably, the high performance of conventional 4-bit PQ relies strongly on x64-specific SIMD registers, such as those provided by AVX2; hence, comparable performance has not yet been achieved on ARM. To fill this gap, we first bundle two 128-bit registers into one 256-bit component. We then apply shuffle operations to each of them using ARM-specific NEON instructions. With this simple but critical modification, we achieve a dramatic speedup for 4-bit PQ on ARM. Experiments show that the proposed method consistently achieves a 10x improvement over naive PQ at the same accuracy.
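
The bundling idea in the abstract can be illustrated with a small scalar emulation. This is only a sketch under stated assumptions, not the authors' implementation: `shuffle128`, `shuffle256`, `naive_scan`, `bundled_scan`, and `demo` are hypothetical names, the block size of 32 vectors and the duplicated lookup table per half are simplifications, and plain Python lists stand in for SIMD registers. On AArch64, a 16-byte table lookup of this kind is available through NEON intrinsics such as `vqtbl1q_u8`; the abstract does not state which instruction the paper uses.

```python
import random

K = 16  # a 4-bit code selects one of 16 centroids per subspace


def shuffle128(table, indices):
    """Emulate a 16-lane byte shuffle: out[i] = table[indices[i]]."""
    assert len(table) == 16 and len(indices) == 16
    return [table[i] for i in indices]


def shuffle256(tables, indices):
    """Treat two 128-bit halves as one 256-bit component.

    Each half is shuffled independently with its own table and indices.
    """
    return [shuffle128(t, i) for t, i in zip(tables, indices)]


def naive_scan(luts, codes):
    """Naive PQ scan: one table lookup per vector per subspace."""
    return [sum(luts[m][c] for m, c in enumerate(code)) for code in codes]


def bundled_scan(luts, codes):
    """Scan a block of 32 vectors, 16 per 128-bit half (assumed layout)."""
    assert len(codes) == 32
    lo, hi = codes[:16], codes[16:]
    acc = [[0] * 16, [0] * 16]
    for m, lut in enumerate(luts):
        # Indices for subspace m: one 4-bit code per vector in each half.
        idx = ([c[m] for c in lo], [c[m] for c in hi])
        looked_up = shuffle256((lut, lut), idx)
        acc = [[a + d for a, d in zip(acc_half, d_half)]
               for acc_half, d_half in zip(acc, looked_up)]
    return acc[0] + acc[1]


def demo(num_subspaces=8, seed=0):
    """Build random tables and codes, then run both scans."""
    rng = random.Random(seed)
    luts = [[rng.randrange(256) for _ in range(K)]
            for _ in range(num_subspaces)]
    codes = [[rng.randrange(K) for _ in range(num_subspaces)]
             for _ in range(32)]
    return naive_scan(luts, codes), bundled_scan(luts, codes)
```

Calling `demo()` returns the distances from both scans for the same random block, so the two lists can be compared directly. The emulation shows only that the bundled lookup computes the same sums as the naive one; any speedup comes from the hardware shuffle, which Python lists do not model.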

Related articles:
arXiv:2501.10479 [cs.LG] (Published 2025-01-16)
Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search
arXiv:1910.08322 [cs.LG] (Published 2019-10-18)
Supervised Learning Approach to Approximate Nearest Neighbor Search
arXiv:2410.18926 [cs.LG] (Published 2024-10-24)
LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search