TY - GEN
T1 - Semantic entity detection by integrating CRF and SVM
AU - Cai, Peng
AU - Luo, Hangzai
AU - Zhou, Aoying
PY - 2010
Y1 - 2010
N2 - Semantic entity detection is very important for extracting and representing the abundant semantic information of multimedia documents. In comparison with other media, e.g. video, image and audio, text expresses semantics more directly and often serves as a bridge in cross-media analysis. However, semantic entity detection from text is still a difficult problem because of the complexity of natural language. In this paper, we propose a novel framework which takes advantage of both CRF (conditional random fields) and SVM (support vector machines), and present its application to semantic entity detection. Using this framework, context features are represented as the probability of an entity boundary and extracted via CRF, and then linguistic and statistical features are extracted via large-scale text document analysis. Finally, all extracted features are integrated and used to perform the classification. As our algorithm systematically integrates the context, linguistic and statistical features, it may outperform traditional algorithms that only adopt part of the features.
UR - https://www.scopus.com/pages/publications/77955021133
U2 - 10.1007/978-3-642-14246-8_47
DO - 10.1007/978-3-642-14246-8_47
M3 - Conference contribution
AN - SCOPUS:77955021133
SN - 3642142451
SN - 9783642142451
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 483
EP - 494
BT - Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings
T2 - 11th International Conference on Web-Age Information Management, WAIM 2010
Y2 - 15 July 2010 through 17 July 2010
ER -