Learning bi-grained cross-correlation siamese networks for visual tracking

  • Defang Zhao
  • Chao Ma
  • Dandan Zhu*
  • Jia Shuai
  • Jianwei Lu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Siamese network based trackers measure the similarity between a target template and a search region by computing their cross-correlation. Specifically, Siamese trackers treat the target template as a spatial filter that is convolved with the search region, which emphasizes the coarse-grained semantic abstraction of the target in the spatial domain. Despite the demonstrated success of Siamese trackers, little attention has been paid to fine-grained spatial details in the cross-correlation computation, which are crucial to precise target localization. In this paper, we propose to learn point-wise cross-correlation Siamese networks for visual tracking. By sketching the contour of the target, the proposed point-wise cross-correlation module helps Siamese networks become aware of the distinctive boundary between the target and the background. In conjunction with traditional depth-wise cross-correlation, the proposed Siamese network takes advantage of both coarse-grained semantic abstraction and fine-grained details to precisely locate the target. Extensive experiments demonstrate the effectiveness and efficiency of the proposed tracker, which achieves new state-of-the-art results on five visual tracking benchmarks, namely VOT2016, VOT2018, VOT2019, OTB100, and LaSOT, at a speed of 38 FPS. As an extra benefit, our tracker can output a segmentation mask for the target. We demonstrate the favorable performance of our tracker on video object segmentation datasets in comparison with the state of the art.
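
To make the two granularities concrete, the following is a minimal NumPy sketch and not the authors' implementation. The abstract does not specify the operators, so two common formulations are assumed: depth-wise cross-correlation slides the whole template over the search features one channel at a time, and point-wise cross-correlation treats the feature vector at each template position as a 1x1 kernel. The function names and tensor layouts are hypothetical.

```python
import numpy as np

def depthwise_xcorr(template, search):
    """Coarse-grained (assumed form): slide the full template over the
    search features, correlating each channel independently.
    template: (C, Hz, Wz), search: (C, Hx, Wx)
    returns:  (C, Hx - Hz + 1, Wx - Wz + 1)
    """
    c, hz, wz = template.shape
    _, hx, wx = search.shape
    out = np.zeros((c, hx - hz + 1, wx - wz + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            window = search[:, i:i + hz, j:j + wz]
            # Sum over the spatial window only; channels stay separate.
            out[:, i, j] = (window * template).sum(axis=(1, 2))
    return out

def pointwise_xcorr(template, search):
    """Fine-grained (assumed form): each template position contributes
    a 1x1 kernel, giving one full-resolution response map per position.
    returns: (Hz * Wz, Hx, Wx)
    """
    c, hz, wz = template.shape
    kernels = template.reshape(c, hz * wz)  # column k = feature vector at position k
    # Dot product over channels between every kernel and every search location.
    return np.einsum("ck,chw->khw", kernels, search)

# Toy feature maps standing in for backbone outputs (hypothetical sizes).
z = np.ones((2, 3, 3))   # template features
x = np.ones((2, 5, 5))   # search-region features
coarse = depthwise_xcorr(z, x)
fine = pointwise_xcorr(z, x)
```

Under these assumptions the depth-wise response shrinks spatially because the template acts as one large kernel, while the point-wise response keeps the full resolution of the search features, which is one plausible reading of why it preserves boundary detail.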

Original language: English
Pages (from-to): 12175-12190
Number of pages: 16
Journal: Applied Intelligence
Volume: 52
Issue number: 11
State: Published - Sep 2022
Externally published: Yes

Keywords

  • Bi-grained
  • Contour
  • Depth-wise
  • Point-wise
  • Siamese network
