Triple Feature Disentanglement for One-Stage Adaptive Object Detection

  • Haoan Wang
  • , Shilong Jia
  • , Tieyong Zeng
  • , Guixu Zhang
  • , Zhi Li*
  • *Corresponding author for this work

Research output: Contribution to journalConference articlepeer-review

7 Scopus citations

Abstract

In recent advancements concerning Domain Adaptive Object Detection (DAOD), unsupervised domain adaptation techniques have proven instrumental. These methods enable enhanced detection capabilities within unlabeled target domains by mitigating distribution differences between source and target domains. A subset of DAOD methods employs disentangled learning to segregate Domain-Specific Representations (DSR) and Domain-Invariant Representations (DIR), with ultimate predictions relying on the latter. Current practices in disentanglement, however, often lead to DIR containing residual domain-specific information. To address this, we introduce the Multi-level Disentanglement Module (MDM) that progressively disentangles DIR, enhancing comprehensive disentanglement. Additionally, our proposed Cyclic Disentanglement Module (CDM) facilitates DSR separation. To refine the process further, we employ the Categorical Features Disentanglement Module (CFDM) to isolate DIR and DSR, coupled with category alignment across scales for improved source-target domain alignment. Given its practical suitability, our model is constructed upon the foundational framework of the Single Shot MultiBox Detector (SSD), which is a one-stage object detection approach. Experimental validation highlights the effectiveness of our method, demonstrating its state-of-the-art performance across three benchmark datasets.

Original languageEnglish
Pages (from-to)5401-5409
Number of pages9
JournalProceedings of the AAAI Conference on Artificial Intelligence
Volume38
Issue number6
DOIs
StatePublished - 25 Mar 2024
Event38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: 20 Feb 202427 Feb 2024

Fingerprint

Dive into the research topics of 'Triple Feature Disentanglement for One-Stage Adaptive Object Detection'. Together they form a unique fingerprint.

Cite this