TY - GEN
T1 - Aspect extraction from product reviews using category hierarchy information
AU - Yang, Yinfei
AU - Chen, Cen
AU - Qiu, Minghui
AU - Bao, Forrest Sheng
N1 - Publisher Copyright: © 2017 Association for Computational Linguistics.
PY - 2017
Y1 - 2017
AB - Aspect extraction is a task to abstract the common properties of objects from corpora discussing them, such as reviews of products. Recent work on aspect extraction is leveraging the hierarchical relationship between products and their categories. However, such effort focuses on the aspects of child categories but ignores those from parent categories. Hence, we propose an LDA-based generative topic model inducing the two-layer categorical information (CAT-LDA), to balance the aspects of both a parent category and its child categories. Our hypothesis is that child categories inherit aspects from parent categories, controlled by the hierarchy between them. Experimental results on 5 categories of Amazon.com products show that both common aspects of parent category and the individual aspects of subcategories can be extracted to align well with the common sense. We further evaluate the manually extracted aspects of 16 products, resulting in an average hit rate of 79.10%.
UR - https://www.scopus.com/pages/publications/85021756678
U2 - 10.18653/v1/e17-2107
DO - 10.18653/v1/e17-2107
M3 - Conference contribution
AN - SCOPUS:85021756678
T3 - 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference
SP - 675
EP - 680
BT - Short Papers
PB - Association for Computational Linguistics (ACL)
T2 - 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017
Y2 - 3 April 2017 through 7 April 2017
ER -