ScriptNet: A Two Stream CNN for Script Identification in Camera-Based Document Images

  • Minzhen Deng
  • Hui Ma
  • Li Liu*
  • Taorong Qiu
  • Yue Lu
  • Ching Y. Suen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Script identification is an essential part of a document image analysis system, since documents written in different scripts may require different processing methods. In this paper, we address script identification in camera-based document images, which is challenging because such images are often subject to perspective distortion, uneven illumination, and similar degradations. We propose a novel network called ScriptNet that is composed of two streams: a spatial stream and a visual stream. The spatial stream captures the spatial dependencies within the image, while the visual stream describes the appearance of the image. The two streams are fused within the network, which can be trained end to end. Extensive experiments demonstrate the effectiveness of the proposed approach and show that the two streams are complementary. The proposed network achieves an accuracy of 99.1%, which compares favourably with state-of-the-art methods. Moreover, it achieves promising results even when trained on non-camera-based document images and tested on camera-based document images.
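The abstract does not say how the two streams are fused, so the following is only a minimal sketch of one common option, late fusion by concatenation, and not the authors' implementation. It assumes each stream has already produced a fixed-length feature vector; the names `fuse_and_classify`, `spatial_feat`, `visual_feat`, the feature sizes, the number of scripts, and the random weights are all hypothetical placeholders.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Apply one fully connected layer: one output score per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def fuse_and_classify(spatial_feat, visual_feat, weights, bias):
    """Assumed fusion: concatenate both stream features, then classify."""
    fused = list(spatial_feat) + list(visual_feat)
    return softmax(linear(weights, bias, fused))


# Hypothetical sizes and random values, for illustration only.
rng = random.Random(0)
NUM_SCRIPTS = 3
FEAT_DIM = 4

spatial_feat = [rng.uniform(-1, 1) for _ in range(FEAT_DIM)]
visual_feat = [rng.uniform(-1, 1) for _ in range(FEAT_DIM)]
weights = [[rng.uniform(-1, 1) for _ in range(2 * FEAT_DIM)]
           for _ in range(NUM_SCRIPTS)]
bias = [0.0] * NUM_SCRIPTS

probs = fuse_and_classify(spatial_feat, visual_feat, weights, bias)
predicted = max(range(NUM_SCRIPTS), key=probs.__getitem__)
print(probs, predicted)
```

In an end-to-end setting, the classifier weights and both stream extractors would be trained jointly, so that the gradient of the classification loss flows through the fused vector into both streams.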

Original language: English
Title of host publication: Neural Information Processing - 29th International Conference, ICONIP 2022, Proceedings
Editors: Mohammad Tanveer, Sonali Agarwal, Seiichi Ozawa, Asif Ekbal, Adam Jatowt
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 14-25
Number of pages: 12
ISBN (Print): 9789819916443
DOIs
State: Published - 2023
Event: 29th International Conference on Neural Information Processing, ICONIP 2022 - Virtual, Online
Duration: 22 Nov 2022 to 26 Nov 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1793 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 29th International Conference on Neural Information Processing, ICONIP 2022
City: Virtual, Online
Period: 22/11/22 to 26/11/22

Keywords

  • Script identification
  • ScriptNet
  • camera-based document images
  • spatial stream
  • visual stream
