Dynamic Feature Selection for Structural Image Content Recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Structural image content recognition (SICR) aims to transcribe a two-dimensional structural image (e.g., a mathematical expression, chemical formula, or music score) into a token sequence. Existing methods are mainly encoder-decoder based and overlook the importance of feature selection and spatial relation extraction in the feature map. In this paper, we propose DEAL (short for Dynamic fEAture seLection) for SICR, which contains a dynamic feature selector and a spatial relation extractor as its two cornerstone modules. Specifically, we propose a novel loss function and a random exploration strategy to dynamically select useful image cells for target sequence generation. Further, we consider the positional and surrounding information of cells in the feature map to extract spatial relations. We conduct extensive experiments to evaluate the performance of DEAL. Experimental results show that DEAL significantly outperforms other state-of-the-art methods.
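The abstract does not spell out the loss function, the exploration rule, or the spatial encoding, so the following is only an illustrative sketch of the general idea of selecting feature-map cells with random exploration and then attaching positional information to the kept cells. Every name and rule here (`select_cells`, `add_positions`, `keep_ratio`, `explore_prob`, top-k thresholding, flipping decisions at random) is a hypothetical stand-in chosen for illustration, not DEAL's actual method.

```python
import random


def select_cells(scores, keep_ratio=0.5, explore_prob=0.1, rng=None):
    """Return a 0/1 mask over an H x W grid of per-cell scores.

    Hypothetical rule (an assumption, not taken from the paper): keep the
    top `keep_ratio` fraction of cells by score, then flip each cell's
    decision with probability `explore_prob` as a crude form of random
    exploration. Tied scores at the threshold are all kept.
    """
    rng = rng or random.Random()
    flat = [s for row in scores for s in row]
    k = max(1, round(keep_ratio * len(flat)))
    threshold = sorted(flat, reverse=True)[k - 1]
    mask = []
    for row in scores:
        mask_row = []
        for s in row:
            keep = s >= threshold
            if rng.random() < explore_prob:
                keep = not keep  # exploration: override the greedy choice
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask


def add_positions(mask):
    """List (row, col, normalised_row, normalised_col) for each kept cell."""
    h, w = len(mask), len(mask[0])
    return [
        (r, c, r / max(h - 1, 1), c / max(w - 1, 1))
        for r in range(h)
        for c in range(w)
        if mask[r][c]
    ]


# Toy 2 x 3 grid of made-up cell scores; exploration is disabled so the
# result is deterministic.
scores = [[0.9, 0.1, 0.2],
          [0.3, 0.8, 0.05]]
mask = select_cells(scores, keep_ratio=0.5, explore_prob=0.0)
cells = add_positions(mask)
```

In a trained model the scores would come from a learned selector and the selection would feed a decoder; the sketch only shows the shape of the data flow under the assumptions stated above.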

Original language: English
Title of host publication: MultiMedia Modeling - 29th International Conference, MMM 2023, Proceedings
Editors: Duc-Tien Dang-Nguyen, Cathal Gurrin, Alan F. Smeaton, Martha Larson, Stevan Rudinac, Minh-Son Dao, Christoph Trattner, Phoebe Chen
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 337-349
Number of pages: 13
ISBN (Print): 9783031278174
State: Published - 2023
Event: 29th International Conference on MultiMedia Modeling, MMM 2023 - Bergen, Norway
Duration: 9 Jan 2023 to 12 Jan 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13834 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on MultiMedia Modeling, MMM 2023
Country/Territory: Norway
City: Bergen
Period: 9/01/23 to 12/01/23

Keywords

  • encoder-decoder network
  • feature selection
  • mathematical expression recognition
  • structural image content recognition
