TextRSR: Enhanced Arbitrary-Shaped Scene Text Representation Via Robust Subspace Recovery

  • Zhiwen Shao
  • Shengtian Jiang*
  • Hancheng Zhu
  • Xuehuai Shi
  • Canlin Li
  • Lizhuang Ma
  • Dit Yan Yeung

* Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In recent years, scene text detection research has increasingly focused on arbitrary-shaped text, for which text representation is a fundamental problem. However, most existing methods still struggle to separate adjacent or overlapping text instances because the spatial positions of their points or segmentation masks are ambiguous. In addition, the time efficiency of the entire pipeline is often neglected, resulting in sub-optimal inference speed. To tackle these problems, we first propose a novel text representation method based on robust subspace recovery, which robustly represents complex text shapes by combining orthogonal basis vectors learned from labeled text contours. These basis vectors capture basis contour patterns with distinct information, enabling clearer boundaries even in densely populated text scenarios. Moreover, we propose a dynamic sparse assignment scheme for positive samples that adaptively adjusts their weights during training, which not only accelerates inference by eliminating redundant predictions but also enhances feature learning by providing sufficient supervision signals. Building on these innovations, we present TextRSR, an accurate and efficient scene text detection network. Extensive experiments on challenging benchmarks demonstrate the superior accuracy and efficiency of TextRSR compared to state-of-the-art methods. In particular, TextRSR achieves an F-measure of 88.5% at 37.8 frames per second (FPS) on the CTW1500 dataset and an F-measure of 89.1% at 23.1 FPS on the Total-Text dataset.
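
As a rough illustration of what it means to represent a contour by combining orthogonal basis vectors learned from contours, the sketch below learns an orthonormal basis with plain SVD (i.e. PCA) on synthetic ellipse contours and encodes each contour as a short coefficient vector. This is a simplified stand-in, not the robust subspace recovery procedure proposed in the paper; the function names (`learn_contour_basis`, `encode`, `decode`), the flattened (x, y) contour layout, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def learn_contour_basis(contours, k):
    """Learn k orthonormal basis vectors from flattened contours.

    contours: array of shape (N, 2P), each row a contour of P (x, y) points.
    Uses ordinary SVD (PCA), NOT a robust subspace recovery algorithm.
    """
    mean = contours.mean(axis=0)
    centered = contours - mean
    # Rows of vt are orthonormal directions ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def encode(contour, mean, basis):
    """Project a flattened contour onto the basis to get k coefficients."""
    return basis @ (contour - mean)

def decode(coeffs, mean, basis):
    """Reconstruct a flattened contour from its k coefficients."""
    return mean + basis.T @ coeffs

def make_ellipses(a, b, num_points=16):
    """Hypothetical toy data: ellipses with semi-axes a, b, flattened as x0, y0, x1, y1, ..."""
    t = np.linspace(0.0, 2.0 * np.pi, num_points, endpoint=False)
    x = np.asarray(a, dtype=float)[:, None] * np.cos(t)
    y = np.asarray(b, dtype=float)[:, None] * np.sin(t)
    return np.stack([x, y], axis=-1).reshape(len(a), -1)

rng = np.random.default_rng(0)
train = make_ellipses(rng.uniform(1.0, 5.0, 50), rng.uniform(1.0, 5.0, 50))
mean, basis = learn_contour_basis(train, k=2)

# Encode an unseen contour with 2 numbers and reconstruct it.
test_contour = make_ellipses([3.3], [1.7])[0]
coeffs = encode(test_contour, mean, basis)
reconstructed = decode(coeffs, mean, basis)
max_error = float(np.max(np.abs(reconstructed - test_contour)))
```

In this toy setting two coefficients suffice because every ellipse is a linear combination of two fixed patterns; real text contours would need more basis vectors, and a robust estimator would matter when the labeled contours contain outliers.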

Original language: English
Journal: IEEE Transactions on Multimedia
State: Accepted/In press - 2026
Externally published: Yes

Keywords

  • arbitrary-shaped text representation
  • dynamic sparse assignment
  • robust subspace recovery
  • scene text detection
