Hierarchical Walking Transformer for Object Re-Identification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Recently, transformers based purely on the attention mechanism have been applied to a wide range of tasks and achieved impressive performance. Despite extensive efforts, the transformer architecture still has drawbacks that hinder its further application: (i) the quadratic complexity of the attention mechanism; (ii) the near absence of inductive bias. In this paper, we present a new hierarchical walking attention, which provides a scalable, flexible, and interpretable sparsification strategy that reduces the complexity from quadratic to linear while clearly boosting performance. Specifically, we learn a hierarchical structure by splitting an image with different receptive fields. We associate each high-level region with a supernode and inject supervision with prior knowledge into this node. The supernode then acts as an indicator that decides whether the area should be skipped, so that a massive number of unnecessary dot-product terms in attention can be avoided. Finally, two sparsification phases are introduced, allowing the transformer to achieve strictly linear complexity. Extensive experiments demonstrate superior performance and efficiency against state-of-the-art methods. Notably, our method reduces the inference time and the total number of tokens by 28% and 94%, respectively, and brings a 2.6% Rank-1 improvement on MSMT17.
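
The skipping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `region_skip_attention`, the per-region `keep_scores`, and the fixed `threshold` are hypothetical stand-ins for the learned supernode indicator, and the sketch shows only how skipping whole regions removes dot-product terms, not the two sparsification phases or the linear-complexity result.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def region_skip_attention(tokens, region_size, keep_scores, threshold=0.5):
    """Self-attention restricted to tokens in regions that are not skipped.

    tokens: list of feature vectors, reused as query, key and value for brevity.
    keep_scores: one hypothetical supernode score per region (assumption).
    Returns (outputs, number_of_dot_products). Tokens in skipped regions
    are passed through unchanged.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    # Indices of tokens whose region's supernode score passes the threshold.
    kept = [i for i in range(len(tokens))
            if keep_scores[i // region_size] >= threshold]
    outputs = [list(t) for t in tokens]
    n_dots = 0
    for i in kept:
        scores = [dot(tokens[i], tokens[j]) / scale for j in kept]
        n_dots += len(kept)
        weights = softmax(scores)
        outputs[i] = [sum(w * tokens[j][d] for w, j in zip(weights, kept))
                      for d in range(dim)]
    return outputs, n_dots


# Toy input: 8 tokens in 2 regions of 4 tokens each.
tokens = [[float(i), 1.0] for i in range(8)]
_, dense_dots = region_skip_attention(tokens, 4, [1.0, 1.0])
sparse_out, sparse_dots = region_skip_attention(tokens, 4, [0.9, 0.1])
print(dense_dots, sparse_dots)
```

In this toy setting, skipping one of the two regions leaves a quarter of the dot products, since the cost scales with the square of the number of kept tokens.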

Original language: English
Title of host publication: MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 4224-4232
Number of pages: 9
ISBN (Electronic): 9781450392037
State: Published - 10 Oct 2022
Event: 30th ACM International Conference on Multimedia, MM 2022 - Lisboa, Portugal
Duration: 10 Oct 2022 to 14 Oct 2022

Publication series

Name: MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia

Conference

Conference: 30th ACM International Conference on Multimedia, MM 2022
Country/Territory: Portugal
City: Lisboa
Period: 10/10/22 to 14/10/22

Keywords

  • hierarchical features
  • linear transformer
  • random walking
