AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries

  • Runqi Wang
  • Huixin Sun
  • Linlin Yang*
  • Shaohui Lin
  • Chuanjian Liu
  • Yan Gao
  • Yao Hu
  • Baochang Zhang

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review


Abstract

DEtection TRansformer (DETR) and its variants have achieved remarkable performance. However, they incur a large computational overhead, which significantly hinders their deployment on resource-limited devices. Prior arts attempt to reduce the computational burden of DETR using low-bit quantization, but these methods suffer a severe performance drop under weight-activation-attention low-bit quantization. We observe that the number of matched queries and positive samples strongly affects the representation capacity of queries in DETR, and quantizing the queries of DETR further reduces this capacity, leading to a severe performance drop. We introduce a new quantization strategy based on Auxiliary Queries for DETR (AQ-DETR), aiming to enhance the capacity of quantized queries. In addition, a layer-by-layer distillation is proposed to reduce the quantization error between quantized attention and its full-precision counterpart. Through extensive experiments on large-scale open datasets, we show that the performance of 4-bit quantized DETR and Deformable DETR models is comparable to their full-precision counterparts.
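To make the two ingredients named in the abstract concrete, the following is a minimal PyTorch sketch, not the authors' released code: the auxiliary-query idea is modeled as extra learnable query embeddings concatenated to the standard object queries, and the layer-by-layer distillation as a per-layer MSE between quantized attention maps and their full-precision counterparts. All names (`AuxiliaryQueries`, `layerwise_attention_distillation_loss`), sizes, and the choice of MSE are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AuxiliaryQueries(nn.Module):
    """Hypothetical sketch: augment DETR's learnable object queries with
    extra auxiliary queries to raise the capacity of the (quantized)
    query set. Sizes are illustrative, not taken from the paper."""

    def __init__(self, num_queries: int = 300, num_aux: int = 100, d_model: int = 256):
        super().__init__()
        self.base_queries = nn.Embedding(num_queries, d_model)
        self.aux_queries = nn.Embedding(num_aux, d_model)

    def forward(self, batch_size: int) -> torch.Tensor:
        # Concatenate base and auxiliary queries, then expand per batch item.
        q = torch.cat([self.base_queries.weight, self.aux_queries.weight], dim=0)
        return q.unsqueeze(0).expand(batch_size, -1, -1)


def layerwise_attention_distillation_loss(
    quant_attn_maps: list[torch.Tensor],
    fp_attn_maps: list[torch.Tensor],
) -> torch.Tensor:
    """Layer-by-layer distillation: penalize the discrepancy between each
    quantized attention map and its full-precision counterpart. MSE is an
    assumed discrepancy measure; the full-precision maps are detached so
    only the quantized model receives gradients."""
    assert len(quant_attn_maps) == len(fp_attn_maps)
    loss = torch.zeros(())
    for q_attn, fp_attn in zip(quant_attn_maps, fp_attn_maps):
        loss = loss + F.mse_loss(q_attn, fp_attn.detach())
    return loss / len(quant_attn_maps)


if __name__ == "__main__":
    queries = AuxiliaryQueries()(batch_size=2)  # shape (2, 400, 256)
    # Fake attention maps for a 6-layer decoder with 8 heads and 400 queries.
    fp = [torch.softmax(torch.randn(2, 8, 400, 400), dim=-1) for _ in range(6)]
    qt = [a + 0.01 * torch.randn_like(a) for a in fp]  # stand-in for quantized maps
    print(queries.shape, layerwise_attention_distillation_loss(qt, fp).item())
```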

Original language: English
Pages (from-to): 15598-15606
Number of pages: 9
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 38
Issue number: 14
DOIs
State: Published - 25 Mar 2024
Event: 38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: 20 Feb 2024 - 27 Feb 2024
