A New Fourier-Attention Guided Approach for Domain-Agnostic Text Localization

  • Arnab Halder
  • Shivakumara Palaiahnakote*
  • Umapada Pal
  • Michael Blumenstein
  • Yue Lu

* Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Text detection is challenging in images captured under adverse conditions, such as underwater scenes and open day and night environments where both shaky and non-shaky cameras are in use. This work develops a new text detection model that copes with the challenges of different domains, namely underwater images, shaky and non-shaky images, and normal scene images. The approach uses Fourier attention and kernels to enhance feature extraction, focusing on the high-frequency components associated with text edges. These features are fed to a dual-stream corner detection stage that employs vertical and horizontal pooling for robust text detection. Additionally, we introduce a cross-star deformable convolution layer, guided by Fourier-derived information, which dynamically adapts its receptive field to achieve precise bounding box localization. Bounding box predictions are iteratively refined using heatmaps and offset adjustments. By integrating frequency-domain analysis with spatially adaptive convolutional operations, the method performs well across diverse text detection scenarios without requiring domain-specific adaptations. The method is evaluated on three datasets: underwater images, shaky and non-shaky images, and normal natural scene images. The results show that it achieves state-of-the-art performance compared to existing methods.
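As a rough illustration of why high-frequency components are associated with edges, the following minimal sketch suppresses low spatial frequencies with a hard radial mask in the Fourier domain. It is not the Fourier attention module described in the abstract: the function name `high_frequency_emphasis`, the `cutoff` parameter, and the hard mask are all assumptions made for this example.

```python
import numpy as np

def high_frequency_emphasis(image, cutoff=0.1):
    """Return the magnitude of the high-pass-filtered image.

    Hypothetical helper for illustration only; `cutoff` is an assumed
    threshold in cycles per pixel, not a value taken from the paper.
    """
    # Frequency coordinate of every FFT bin along each axis (cycles per pixel)
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Keep only bins whose radial frequency exceeds the cutoff
    mask = radius > cutoff

    spectrum = np.fft.fft2(image)
    # The mask is symmetric in frequency, so the inverse transform of a
    # real image is real up to rounding error; take the real part
    filtered = np.fft.ifft2(spectrum * mask).real
    return np.abs(filtered)

# A synthetic image with one vertical step edge between columns 15 and 16
step = np.zeros((32, 32))
step[:, 16:] = 1.0
response = high_frequency_emphasis(step)
```

On this synthetic step image the response is concentrated around the edge columns and is much weaker in the flat regions, which is the property a frequency-guided detector can exploit.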

Original language: English
Title of host publication: Document Analysis and Recognition – ICDAR 2025 - 19th International Conference, Proceedings
Editors: Xu-Cheng Yin, Dimosthenis Karatzas, Daniel Lopresti
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 180-199
Number of pages: 20
ISBN (Print): 9783032046239
State: Published - 2026
Event: 19th International Conference on Document Analysis and Recognition, ICDAR 2025 - Wuhan, China
Duration: 16 Sep 2025 to 21 Sep 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16025 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Document Analysis and Recognition, ICDAR 2025
Country/Territory: China
City: Wuhan
Period: 16/09/25 to 21/09/25

Keywords

  • Fourier Attention
  • Shaky-Non-Shaky Text
  • Text Detection
  • Underwater Text
