ECNU at SemEval-2016 task 1: Leveraging word embedding from macro and micro views to boost performance for semantic textual similarity

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

11 Scopus citations

Abstract

This paper presents our submissions for the semantic textual similarity task in SemEval 2016. Based on several traditional features (i.e., string-based, corpus-based, machine translation similarity, and alignment metrics), we leverage word embeddings from a macro view (i.e., first obtain a representation of each sentence, then measure the similarity of the sentence pair) and a micro view (i.e., measure the similarity of word pairs separately) to boost performance. Because the training and test data come from various domains, we adopt three different strategies: 1) U-SEVEN: an unsupervised model that uses seven straightforward metrics; 2) S1-All: using all available datasets; 3) S2: selecting the most similar training sets for each test set. Results on the test sets show that the unified supervised model (i.e., S1-All) achieves the best average performance, with a mean correlation of 75.07%.
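The macro and micro views described in the abstract can be illustrated with a minimal sketch. The abstract does not say how sentence representations are composed or how word-pair similarities are aggregated, so the choices here are assumptions made only for illustration: the macro view averages word vectors and takes the cosine, and the micro view averages each word's best cosine match in the other sentence. The function names and the toy embedding table are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def macro_similarity(tokens1, tokens2, emb):
    """Macro view: build one vector per sentence, then compare the two vectors.

    Assumption: the sentence vector is the mean of its word vectors.
    """
    def sentence_vector(tokens):
        vecs = [emb[t] for t in tokens if t in emb]
        if not vecs:
            return None
        dim = len(vecs[0])
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

    v1, v2 = sentence_vector(tokens1), sentence_vector(tokens2)
    if v1 is None or v2 is None:
        return 0.0
    return cosine(v1, v2)


def micro_similarity(tokens1, tokens2, emb):
    """Micro view: compare word pairs separately, then aggregate.

    Assumption: each word is matched to its most similar word in the other
    sentence, and the two directed averages are combined symmetrically.
    """
    def directed(src, tgt):
        src = [t for t in src if t in emb]
        tgt = [t for t in tgt if t in emb]
        if not src or not tgt:
            return 0.0
        best = [max(cosine(emb[a], emb[b]) for b in tgt) for a in src]
        return sum(best) / len(best)

    return 0.5 * (directed(tokens1, tokens2) + directed(tokens2, tokens1))


# Toy 2-dimensional embeddings, invented for this example only.
toy_emb = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "sits": [0.0, 1.0],
    "runs": [0.1, 0.9],
}

s1 = ["cat", "sits"]
s2 = ["dog", "runs"]
print("macro:", macro_similarity(s1, s2, toy_emb))
print("micro:", micro_similarity(s1, s2, toy_emb))
```

In a system like the one the abstract outlines, scores of this kind would be used as features alongside the traditional string-based, corpus-based, and alignment features, not as a final prediction on their own.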

Original language: English
Title of host publication: SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 621-627
Number of pages: 7
ISBN (Electronic): 9781941643952
State: Published - 2016
Event: 10th International Workshop on Semantic Evaluation, SemEval 2016 co-located with the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2016 - San Diego, United States
Duration: 16 Jun 2016 to 17 Jun 2016

Publication series

Name: SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings

Conference

Conference: 10th International Workshop on Semantic Evaluation, SemEval 2016 co-located with the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2016
Country/Territory: United States
City: San Diego
Period: 16/06/16 to 17/06/16
