TY - GEN
T1 - ECNU
T2 - 8th International Workshop on Semantic Evaluation, SemEval 2014 - co-located with the 25th International Conference on Computational Linguistics, COLING 2014
AU - Zhu, Tian Tian
AU - Lan, Man
N1 - Publisher Copyright: © 8th International Workshop on Semantic Evaluation, SemEval 2014 - co-located with the 25th International Conference on Computational Linguistics, COLING 2014, Proceedings. All rights reserved.
PY - 2014
Y1 - 2014
AB - This paper reports our submissions to the Cross Level Semantic Similarity (CLSS) task in SemEval 2014. We submitted one Random Forest regression system for each cross-level text pair type, i.e., Paragraph to Sentence (P-S), Sentence to Phrase (S-Ph), Phrase to Word (Ph-W) and Word to Sense (W-Se). For text pairs at the P-S and S-Ph levels, we treat both texts as sentences and extract heterogeneous types of similarity features, including string features, knowledge-based features, corpus-based features, syntactic features, machine-translation-based features and multi-level text features. For text pairs at the Ph-W and W-Se levels, most of these features are not applicable or available because the texts carry too little information. To overcome this problem, we propose several information enrichment methods using WordNet synonyms and definitions. Our systems rank 2nd out of 18 teams on both Pearson correlation (official rank) and Spearman rank correlation. Specifically, our systems take second place at the P-S, S-Ph and Ph-W levels and 4th place at the W-Se level in terms of Pearson correlation.
UR - https://www.scopus.com/pages/publications/85104404828
U2 - 10.3115/v1/s14-2043
DO - 10.3115/v1/s14-2043
M3 - Conference contribution
AN - SCOPUS:85104404828
T3 - 8th International Workshop on Semantic Evaluation, SemEval 2014 - co-located with the 25th International Conference on Computational Linguistics, COLING 2014, Proceedings
SP - 265
EP - 270
BT - 8th International Workshop on Semantic Evaluation, SemEval 2014 - co-located with the 25th International Conference on Computational Linguistics, COLING 2014, Proceedings
A2 - Nakov, Preslav
A2 - Zesch, Torsten
PB - Association for Computational Linguistics (ACL)
Y2 - 23 August 2014 through 24 August 2014
ER -