TY - GEN
T1 - ECNU at SemEval-2016 task 1
T2 - 10th International Workshop on Semantic Evaluation, SemEval 2016 co-located with the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2016
AU - Tian, Junfeng
AU - Lan, Man
N1 - Publisher Copyright:
© 2016 Association for Computational Linguistics.
PY - 2016
Y1 - 2016
N2 - This paper presents our submissions for the semantic textual similarity task at SemEval 2016. Based on several traditional features (i.e., string-based, corpus-based, machine translation similarity and alignment metrics), we leverage word embeddings from macro (i.e., first obtain a representation of each sentence, then measure the similarity of the sentence pair) and micro views (i.e., measure the similarity of word pairs separately) to boost performance. Due to the various domains of training data and test data, we adopt three different strategies: 1) U-SEVEN: an unsupervised model, which utilizes seven straightforward metrics; 2) S1-All: using all available datasets; 3) S2: selecting the most similar training sets for each test set. Results on test sets show that the unified supervised model (i.e., S1-All) achieves the best averaged performance with a mean correlation of 75.07%.
AB - This paper presents our submissions for the semantic textual similarity task at SemEval 2016. Based on several traditional features (i.e., string-based, corpus-based, machine translation similarity and alignment metrics), we leverage word embeddings from macro (i.e., first obtain a representation of each sentence, then measure the similarity of the sentence pair) and micro views (i.e., measure the similarity of word pairs separately) to boost performance. Due to the various domains of training data and test data, we adopt three different strategies: 1) U-SEVEN: an unsupervised model, which utilizes seven straightforward metrics; 2) S1-All: using all available datasets; 3) S2: selecting the most similar training sets for each test set. Results on test sets show that the unified supervised model (i.e., S1-All) achieves the best averaged performance with a mean correlation of 75.07%.
UR - https://www.scopus.com/pages/publications/85035757196
U2 - 10.18653/v1/s16-1094
DO - 10.18653/v1/s16-1094
M3 - Conference contribution
AN - SCOPUS:85035757196
T3 - SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings
SP - 621
EP - 627
BT - SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings
PB - Association for Computational Linguistics (ACL)
Y2 - 16 June 2016 through 17 June 2016
ER -