TY - GEN
T1 - An information theoretic approach to sentiment polarity classification
AU - Lin, Yuming
AU - Zhang, Jingwei
AU - Wang, Xiaoling
AU - Zhou, Aoying
PY - 2012
Y1 - 2012
AB - Sentiment classification is the task of classifying documents according to their overall sentiment inclination. It is important and widely used in many web applications, such as credibility analysis of news sites, recommendation systems, and mining of online discussions. The vector space model is widely used to model documents in supervised sentiment classification, where the feature presentation (including feature type and weight function) is crucial for classification accuracy. Traditional feature presentation methods from text categorization do not perform well in sentiment classification, because sentiment is expressed in more subtle ways. We analyze the relationships between terms and sentiment labels on the basis of information theory, and propose an information-theoretic approach to the sentiment classification of documents. We first use mutual information to quantify the sentiment polarity of each term in a document. The terms are then weighted in the vector space according to both their sentiment scores and their contribution to the document. Extensive experiments with SVM on multiple sets of product reviews show that our approach is more effective than the traditional ones.
KW - Feature presentation
KW - Information theory
KW - Mutual information
KW - Sentiment classification
UR - https://www.scopus.com/pages/publications/84860577610
U2 - 10.1145/2184305.2184313
DO - 10.1145/2184305.2184313
M3 - Conference contribution
AN - SCOPUS:84860577610
SN - 9781450312370
T3 - ACM International Conference Proceeding Series
SP - 35
EP - 40
BT - WebQuality 2012 - Proceedings of the 2nd Joint WICOW/AIRWeb Workshop on Web Quality
T2 - 2nd Joint WICOW/AIRWeb Workshop on Web Quality, WebQuality 2012
Y2 - 16 April 2012
ER -