
CSFwinformer: Cross-Space-Frequency Window Transformer for Mirror Detection

  • Zhifeng Xie
  • Sen Wang
  • Qiucheng Yu
  • Xin Tan*
  • Yuan Xie
  • *Corresponding author for this work
  • Shanghai University
  • City University of Hong Kong
  • East China Normal University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Mirror detection is a challenging task since mirrors do not possess a consistent visual appearance. Even the Segment Anything Model (SAM), which boasts superior zero-shot performance, cannot accurately detect the position of mirrors. Existing methods locate the mirror under specific assumptions, such as a correspondence between objects inside and outside the mirror, or a semantic association between the mirror and surrounding objects. However, these assumptions do not hold in all scenarios. For instance, the scene may contain no real objects corresponding to the reflected ones, or it may be difficult to extract meaningful semantic associations in complex scenes. Humans, on the other hand, can easily recognize mirrors through the specular texture caused by their material. To mine mirror features in more general scenes, we propose a Cross-Space-Frequency Window Transformer (CSFwinformer) to extract spatial and frequency features for texture analysis. Specifically, we design a Spatial-Frequency Window Alignment module (SFWA) to calculate spatial-frequency feature affinities and learn the difference between mirror and non-mirror textures. We then propose a Dilated Window Attention (DWA) to extract global features that compensate for the limitations of window alignment. In addition, we propose a Cross-Modality Context Contrast module (CMCC) to fuse cross-modality features and global features, which enables information flow between different windows and takes full advantage of cross-modality information. Extensive experiments show that our method performs favorably against state-of-the-art methods on three mirror detection benchmarks and significantly improves SAM's performance on mirror detection. The code is available at https://github.com/wangsen99/CSFwinformer.
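
To make the idea of window-wise spatial-frequency affinity more concrete, the following is a minimal NumPy sketch. It is not the authors' SFWA module (the repository linked above holds the actual implementation). It assumes, purely for illustration, that frequency features are the per-window FFT amplitude spectrum and that the affinity is a softmax-normalized scaled dot product. The function names `window_partition` and `window_affinity` are hypothetical.

```python
import numpy as np


def window_partition(x, win):
    """Split an (H, W, C) feature map into non-overlapping (win, win, C) windows.

    Hypothetical helper; assumes H and W are divisible by win.
    """
    h, w, c = x.shape
    x = x.reshape(h // win, win, w // win, win, c)
    # Bring the two window-grid axes together, then flatten them into one axis.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win, win, c)


def window_affinity(x, win):
    """Toy spatial-frequency affinity computed independently inside each window.

    Assumption: frequency features are the 2D FFT amplitude of each window,
    and affinity is a row-wise softmax over scaled dot products.
    """
    spatial = window_partition(x, win)                  # (N, win, win, C)
    freq = np.abs(np.fft.fft2(spatial, axes=(1, 2)))    # amplitude spectrum per window
    n, _, _, c = spatial.shape
    s = spatial.reshape(n, win * win, c)                # spatial tokens
    f = freq.reshape(n, win * win, c)                   # frequency tokens
    logits = s @ f.transpose(0, 2, 1) / np.sqrt(c)      # (N, win*win, win*win)
    logits -= logits.max(axis=-1, keepdims=True)        # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 8, 3))
affinity = window_affinity(features, win=4)
print(affinity.shape)
```

Because each window is processed in isolation, a scheme like this cannot relate distant regions, which is the kind of limitation the abstract says DWA and CMCC are meant to address.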

Original language: English
Pages (from-to): 1853-1867
Number of pages: 15
Journal: IEEE Transactions on Image Processing
Volume: 33
Publication status: Published - 2024
