TY - GEN
T1 - Quantize Sequential Recommenders Without Private Data
AU - Shi, Lingfeng
AU - Liu, Yuang
AU - Wang, Jun
AU - Zhang, Wei
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/4/30
Y1 - 2023/4/30
N2 - Deep neural networks have achieved great success in sequential recommendation systems. While maintaining high competence in user modeling and next-item recommendation, these models have long been plagued by their numerous parameters and heavy computation, which prevent them from being deployed on resource-constrained mobile devices. Model quantization, one of the main paradigms of model compression, converts floating-point parameters to low-bit values to reduce parameter redundancy and accelerate inference. To avoid drastic performance degradation, it usually requires a fine-tuning phase on the original dataset. However, the training set of user-item interactions is not always available due to transmission limits or privacy concerns. In this paper, we propose a novel framework to quantize sequential recommenders without access to any real private data. A generator is employed in the framework to synthesize fake sequence samples that feed the quantized sequential recommendation model and minimize its gap with the full-precision model. The generator and the quantized model are optimized in a min-max game, alternating between discrepancy estimation and knowledge transfer. Moreover, we devise a two-level discrepancy modeling strategy to transfer information between the quantized model and the full-precision model. Extensive experiments with various recommendation networks on three public datasets demonstrate the effectiveness of the proposed framework.
AB - Deep neural networks have achieved great success in sequential recommendation systems. While maintaining high competence in user modeling and next-item recommendation, these models have long been plagued by their numerous parameters and heavy computation, which prevent them from being deployed on resource-constrained mobile devices. Model quantization, one of the main paradigms of model compression, converts floating-point parameters to low-bit values to reduce parameter redundancy and accelerate inference. To avoid drastic performance degradation, it usually requires a fine-tuning phase on the original dataset. However, the training set of user-item interactions is not always available due to transmission limits or privacy concerns. In this paper, we propose a novel framework to quantize sequential recommenders without access to any real private data. A generator is employed in the framework to synthesize fake sequence samples that feed the quantized sequential recommendation model and minimize its gap with the full-precision model. The generator and the quantized model are optimized in a min-max game, alternating between discrepancy estimation and knowledge transfer. Moreover, we devise a two-level discrepancy modeling strategy to transfer information between the quantized model and the full-precision model. Extensive experiments with various recommendation networks on three public datasets demonstrate the effectiveness of the proposed framework.
KW - Data-free Quantization
KW - Model Compression
KW - Sequential Recommenders
UR - https://www.scopus.com/pages/publications/85159267880
U2 - 10.1145/3543507.3583351
DO - 10.1145/3543507.3583351
M3 - Conference contribution
AN - SCOPUS:85159267880
T3 - ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023
SP - 1043
EP - 1052
BT - ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023
PB - Association for Computing Machinery, Inc
T2 - 32nd ACM World Wide Web Conference, WWW 2023
Y2 - 30 April 2023 through 4 May 2023
ER -