iRMP: From printed forms to relational data model

  • Jun Zhou
  • Han Yu
  • Cheng Xie
  • Hongming Cai
  • Lihong Jiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Printed forms are ubiquitous in daily business processes, yet they are difficult for computers to process. There is thus a growing need to automatically convert these printouts into machine-readable data, stored as structured data models for further applications. To meet this need, we first extract table lines and text from printed forms and convert them into RDF models. The heterogeneous models extracted from different form instances are then connected based on string and lexical similarity. Finally, following a set of mapping rules, we automatically convert the connected models into a relational data model, which lays the foundation for subsequent uses such as database generation and linked data interconnection. Multiple experiments using real resumes as the dataset, together with a case study, are conducted to verify the framework. We also construct a prototype system, iRMP (intelligent Resource Management Platform), to demonstrate the practicability and effectiveness of the approach.
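The connection and mapping steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it skips the RDF stage, uses only string similarity (via Python's standard `difflib`) with no lexical matching, and the function names, the 0.8 threshold, the sample field labels, and the single-table mapping rule are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def connect_labels(forms: list[dict[str, str]], threshold: float = 0.8) -> list[list[str]]:
    """Group field labels from several form instances into clusters.

    A label joins the first cluster whose representative (its first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    The threshold is an assumed value, not one taken from the paper.
    """
    clusters: list[list[str]] = []
    for form in forms:
        for label in form:
            for cluster in clusters:
                if similarity(label, cluster[0]) >= threshold:
                    if label not in cluster:
                        cluster.append(label)
                    break
            else:
                clusters.append([label])
    return clusters


def to_create_table(table: str, clusters: list[list[str]]) -> str:
    """Hypothetical mapping rule: one TEXT column per cluster of labels."""
    columns = [c[0].lower().strip().replace(" ", "_") for c in clusters]
    body = ",\n".join(f"    {col} TEXT" for col in columns)
    return f"CREATE TABLE {table} (\n    id INTEGER PRIMARY KEY,\n{body}\n);"


# Two made-up resume instances whose field labels differ slightly.
forms = [
    {"Name": "Alice", "Phone Number": "555-0100"},
    {"name": "Bob", "Phone number:": "555-0101", "Email": "bob@example.com"},
]
clusters = connect_labels(forms)
ddl = to_create_table("resume", clusters)
print(ddl)
```

Here the near-identical labels from the two instances collapse into shared columns, while `Email`, which appears in only one instance, gets a column of its own. A lexical-similarity step would additionally be needed to merge labels that are synonyms but not spelled alike.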

Original language: English
Title of host publication: Proceedings - 18th IEEE International Conference on High Performance Computing and Communications, 14th IEEE International Conference on Smart City and 2nd IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016
Editors: Laurence T. Yang, Jinjun Chen
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1394-1401
Number of pages: 8
ISBN (Electronic): 9781509042968
State: Published - 20 Jan 2017
Externally published: Yes
Event: 18th IEEE International Conference on High Performance Computing and Communications, 14th IEEE International Conference on Smart City and 2nd IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016 - Sydney, Australia
Duration: 12 Dec 2016 to 14 Dec 2016

Publication series

Name: Proceedings - 18th IEEE International Conference on High Performance Computing and Communications, 14th IEEE International Conference on Smart City and 2nd IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016

Conference

Conference: 18th IEEE International Conference on High Performance Computing and Communications, 14th IEEE International Conference on Smart City and 2nd IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016
Country/Territory: Australia
City: Sydney
Period: 12/12/16 to 14/12/16

Keywords

  • Data extraction
  • Form recognition
  • Model connection
  • Relational data model
