TY - GEN
T1 - Parallel hashing using representative points in hyperoctants
AU - Shen, Chaomin
AU - Yu, Mixue
AU - Zhao, Chenxiao
AU - Peng, Yaxin
AU - Zhang, Guixu
N1 - Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/10/17
Y1 - 2018/10/17
N2 - The goal of hashing is to learn a low-dimensional binary representation of high-dimensional information, leading to a tremendous reduction of computational cost. Previous studies usually achieved this goal by applying projection or quantization methods. However, the projection method fails to capture the intrinsic data structures, and the quantization method cannot make full use of complete information by its strategy of partitioning original space. To combine their advantages and avoid their drawbacks, we propose a novel algorithm, termed as representative points quantization (RPQ), by using the representative points defined as the barycenters of points in the hyperoctants. To settle the problem of exponential time complexity with the growth of the coding length, for long hashing codes, we further propose a parallel RPQ (PRPQ) algorithm, by separating a long code into several short codes, re-coding the short codes in different low dimensional subspaces, and then concatenating them to a long code. Experiments on image retrieval tasks demonstrate that RPQ and PRPQ can well capture the main topology structure of data, showing that our algorithm achieves better performance than state-of-the-art methods.
AB - The goal of hashing is to learn a low-dimensional binary representation of high-dimensional information, leading to a tremendous reduction of computational cost. Previous studies usually achieved this goal by applying projection or quantization methods. However, the projection method fails to capture the intrinsic data structures, and the quantization method cannot make full use of complete information by its strategy of partitioning original space. To combine their advantages and avoid their drawbacks, we propose a novel algorithm, termed as representative points quantization (RPQ), by using the representative points defined as the barycenters of points in the hyperoctants. To settle the problem of exponential time complexity with the growth of the coding length, for long hashing codes, we further propose a parallel RPQ (PRPQ) algorithm, by separating a long code into several short codes, re-coding the short codes in different low dimensional subspaces, and then concatenating them to a long code. Experiments on image retrieval tasks demonstrate that RPQ and PRPQ can well capture the main topology structure of data, showing that our algorithm achieves better performance than state-of-the-art methods.
KW - Approximate nearest neighbor search
KW - Binary code
KW - Dimensionality reduction
KW - Hashing
KW - Information retrieval
UR - https://www.scopus.com/pages/publications/85058018272
U2 - 10.1145/3269206.3271780
DO - 10.1145/3269206.3271780
M3 - 会议稿件
AN - SCOPUS:85058018272
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 813
EP - 822
BT - CIKM 2018 - Proceedings of the 27th ACM International Conference on Information and Knowledge Management
A2 - Paton, Norman
A2 - Candan, Selcuk
A2 - Wang, Haixun
A2 - Allan, James
A2 - Agrawal, Rakesh
A2 - Labrinidis, Alexandros
A2 - Cuzzocrea, Alfredo
A2 - Zaki, Mohammed
A2 - Srivastava, Divesh
A2 - Broder, Andrei
A2 - Schuster, Assaf
PB - Association for Computing Machinery
T2 - 27th ACM International Conference on Information and Knowledge Management, CIKM 2018
Y2 - 22 October 2018 through 26 October 2018
ER -