
A New Fourier-Attention Guided Approach for Domain-Agnostic Text Localization

  • Arnab Halder
  • Shivakumara Palaiahnakote*
  • Umapada Pal
  • Michael Blumenstein
  • Yue Lu
  • *Corresponding author for this work
  • University of Technology Sydney
  • Indian Statistical Institute
  • University of Salford

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Detecting text in images captured under adverse conditions, such as underwater scenes and open day and night environments where both shaky and non-shaky cameras may be involved, is challenging. This work develops a new text detection model that copes with the challenges of different domains, namely underwater images, shaky and non-shaky images, and normal scene images. The approach leverages Fourier attention and kernels to enhance feature extraction, focusing on the high-frequency components associated with text edges. These features are fed to a dual-stream corner detection module that employs vertical and horizontal pooling for robust text detection. Additionally, we introduce a cross-star deformable convolution layer, guided by Fourier-derived information, which dynamically adapts its receptive field to achieve precise bounding box localization. Bounding box predictions are iteratively refined using heatmaps and offset adjustments. By integrating frequency-domain analysis with spatially adaptive convolutional operations, the method performs well across diverse text detection scenarios without requiring domain-specific adaptations. Its performance is demonstrated on three different datasets: underwater images, shaky and non-shaky images, and normal natural scene images. The results show that the proposed method achieves state-of-the-art performance compared to existing methods.
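
To make the idea of emphasizing high-frequency components concrete, the following is a minimal sketch and not the authors' model. It assumes a single-channel image and replaces the learned Fourier attention with a fixed Gaussian high-pass filter. The function name `high_frequency_emphasis` and the parameter `sigma` are hypothetical and introduced only for illustration.

```python
import numpy as np

def high_frequency_emphasis(image, sigma=0.1):
    """Return the magnitude of the high-pass-filtered image.

    Assumption: a fixed Gaussian high-pass mask stands in for the
    learned Fourier attention described in the abstract.
    `sigma` is the mask width in cycles per pixel (hypothetical parameter).
    """
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image)

    # Frequency coordinates (cycles per pixel) for each FFT bin.
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    radius_sq = fx ** 2 + fy ** 2

    # The mask is 0 at the DC term and approaches 1 at high frequencies.
    mask = 1.0 - np.exp(-radius_sq / (2.0 * sigma ** 2))

    filtered = np.fft.ifft2(spectrum * mask).real
    return np.abs(filtered)

# Toy input: a bright square on a dark background.
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
edges = high_frequency_emphasis(img)
```

On this toy input the response is large along the border of the square and close to zero in its flat interior, which is the property that makes high-frequency features useful for locating text edges. A learned attention would weight frequencies adaptively rather than with a fixed mask.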

Original language: English
Title of host publication: Document Analysis and Recognition – ICDAR 2025 - 19th International Conference, Proceedings
Editors: Xu-Cheng Yin, Dimosthenis Karatzas, Daniel Lopresti
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 180-199
Number of pages: 20
ISBN (Print): 9783032046239
DOI
Publication status: Published - 2026
Event: 19th International Conference on Document Analysis and Recognition, ICDAR 2025 - Wuhan, China
Duration: 16 Sep 2025 to 21 Sep 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16025 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Document Analysis and Recognition, ICDAR 2025
Country/Territory: China
City: Wuhan
Period: 16/09/25 to 21/09/25
