Not all pixels are matched: Dense contrastive learning for cross-modality person re-identification

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

83 Scopus citations

Abstract

Visible-Infrared Person Re-Identification (VI-ReID) has become an emerging task for night-Time surveillance systems. In order to reduce the cross-modality discrepancy, previous works either align the features via metric learning or generate synthesized cross-modality images by Generative Adversary Network. However, feature-level alignment ignores the heterogeneous data itself while generative framework suffers from the low generation quality, limiting their applications. In this paper, we propose a dense contrastive learning framework (DCLNet), which performs pixel-To-pixel dense alignment acting on the intermediate representations, rather than the final deep feature. It is a new loss function that brings views of positive pixels with same semantic information closer in shallow representation space, whilst pushing views of negative pixels apart. It naturally provides additional dense supervision and captures fine-grained pixel correspondence, reducing the modality gap from a new perspective. To implement it, a Part Aware Parsing (PAP) module and a Semantic Rectification Module (SRM) are introduced to learn and refine a semantic-guided mask, allowing us to efficiently find positive pairs only requiring instance-level supervision. Extensive experiments on the public SYSU-MM01 and RegDB datasets demonstrate the superiority of our pipeline over state-of-The-Arts. Code is available at https://github.com/sunhz0117/DCLNet.

Original languageEnglish
Title of host publicationMM 2022 - Proceedings of the 30th ACM International Conference on Multimedia
PublisherAssociation for Computing Machinery, Inc
Pages5333-5341
Number of pages9
ISBN (Electronic)9781450392037
DOIs
StatePublished - 10 Oct 2022
Event30th ACM International Conference on Multimedia, MM 2022 - Lisboa, Portugal
Duration: 10 Oct 202214 Oct 2022

Publication series

NameMM 2022 - Proceedings of the 30th ACM International Conference on Multimedia

Conference

Conference30th ACM International Conference on Multimedia, MM 2022
Country/TerritoryPortugal
CityLisboa
Period10/10/2214/10/22

Keywords

  • cross-modality alignment
  • dense contrastive learning
  • visible-infrared person re-identification

Fingerprint

Dive into the research topics of 'Not all pixels are matched: Dense contrastive learning for cross-modality person re-identification'. Together they form a unique fingerprint.

Cite this