Convolution Neural Network with Active Learning for Information Extraction of Enterprise Announcements

Lei Fu, Zhaoxia Yin, Yi Liu, Jun Zhang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

We propose using a convolutional neural network (CNN) with active learning for information extraction from enterprise announcements. Training a supervised deep learning model usually requires a large amount of training data with high-quality reference labels. Producing such labels by hand is tedious and, because inter-labeler agreement is low, unreliable. Active learning alleviates this problem by automatically selecting a small number of unlabeled samples for humans to correct by hand. The CNN is then trained iteratively on the labeled data until the target performance is reached. We propose three sample selection methods based on a certainty criterion. We also build an enterprise announcements dataset for the experiments, which contains 10,410 samples in total. Our experimental results show that active learning reduces the amount of labeled data needed to reach a given extraction accuracy by more than 45.79% compared with training without it.
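
The selection loop described in the abstract can be sketched as follows. This is only an illustration, not the authors' implementation: the abstract does not name its three certainty-based selection methods, so the three scores below (least confidence, margin, entropy) are common choices assumed here, and `oracle`, `train_fn`, and `predict_fn` are hypothetical callbacks standing in for the human labeler and the CNN.

```python
import numpy as np

# Assumed uncertainty scores (higher = less certain). The paper's own three
# criteria are not given in the abstract; these are standard stand-ins.
def least_confidence(probs):
    return 1.0 - probs.max(axis=1)

def margin_score(probs):
    ranked = np.sort(probs, axis=1)
    return 1.0 - (ranked[:, -1] - ranked[:, -2])  # small top-2 gap -> high score

def entropy_score(probs):
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)

def select_batch(probs, k, score_fn=least_confidence):
    """Return positions of the k least-certain rows of `probs`."""
    scores = score_fn(probs)
    return np.argsort(-scores, kind="stable")[:k]

def active_learning_loop(pool_x, oracle, train_fn, predict_fn, k, rounds,
                         score_fn=least_confidence):
    """Hypothetical loop: query k samples per round, label them, retrain."""
    labeled_idx, labels = [], []
    unlabeled = np.arange(len(pool_x))
    model = None
    for _ in range(rounds):
        if len(unlabeled) == 0:
            break
        if model is None:
            picked = unlabeled[:k]  # cold start: no model to score with yet
        else:
            probs = predict_fn(model, pool_x[unlabeled])
            picked = unlabeled[select_batch(probs, k, score_fn)]
        labeled_idx.extend(picked.tolist())
        labels.extend(oracle(i) for i in picked.tolist())  # human labeling step
        unlabeled = np.setdiff1d(unlabeled, picked)
        model = train_fn(pool_x[labeled_idx], np.array(labels))
    return model, labeled_idx
```

In practice `rounds` would be replaced by a stopping rule such as reaching a target accuracy on held-out data, which matches the abstract's "until the target performance is reached".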

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 7th CCF International Conference, NLPCC 2018, Proceedings
Editors: Vincent Ng, Dongyan Zhao, Sujian Li, Hongying Zan, Min Zhang
Publisher: Springer Verlag
Pages: 330-339
Number of pages: 10
ISBN (Print): 9783319995007
State: Published - 2018
Externally published: Yes
Event: 7th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2018 - Hohhot, China
Duration: 26 Aug 2018 to 30 Aug 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11109 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2018
Country/Territory: China
City: Hohhot
Period: 26/08/18 to 30/08/18

Keywords

  • Active learning
  • Convolutional neural networks
  • Enterprise announcements
  • Text classification
