A comprehensive comparative study on term weighting schemes for text categorization with support vector machines

  • Man Lan*
  • Chew Lim Tan
  • Hwee Boon Low
  • Sam Yuan Sung
  • *Corresponding author for this work
  • Agency for Science, Technology and Research, Singapore
  • National University of Singapore

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Term weighting, which converts documents into vectors in the term space, is a vital step in automatic text categorization. In this paper, we conduct comprehensive experiments comparing various term weighting schemes with SVM on two widely used benchmark data sets. We also present a new term weighting scheme, tf-rf, to improve the term's discriminating power. The controlled experimental results show that the newly proposed tf-rf scheme is significantly better than other widely used term weighting schemes. Compared with schemes that use the tf factor alone, adding the idf factor does not improve, and may even decrease, the term's discriminating power for text categorization.
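The abstract names tf-rf but does not state its formula. As a rough illustration only, the sketch below assumes the commonly cited form rf = log2(2 + a / max(1, c)), where a and c are the numbers of positive-category and negative-category documents containing the term. The function names and all counts are hypothetical and are not taken from the paper's experiments.

```python
import math


def tf_idf(tf, n_docs, df):
    """Classic tf-idf weight: term frequency times log2(N / df)."""
    return tf * math.log2(n_docs / df)


def tf_rf(tf, pos_with_term, neg_with_term):
    """tf-rf weight under the ASSUMED form rf = log2(2 + a / max(1, c)).

    pos_with_term (a): positive-category documents containing the term.
    neg_with_term (c): negative-category documents containing the term.
    """
    rf = math.log2(2 + pos_with_term / max(1, neg_with_term))
    return tf * rf


# Hypothetical corpus of 1000 documents. Both terms occur in 100 documents,
# so idf cannot tell them apart, but they are distributed differently
# across the positive and negative categories.
term_a = {"pos": 80, "neg": 20}  # concentrated in the positive category
term_b = {"pos": 50, "neg": 50}  # evenly spread across categories

idf_a = tf_idf(1, 1000, term_a["pos"] + term_a["neg"])
idf_b = tf_idf(1, 1000, term_b["pos"] + term_b["neg"])
rf_a = tf_rf(1, term_a["pos"], term_a["neg"])
rf_b = tf_rf(1, term_b["pos"], term_b["neg"])

print(f"tf-idf: term_a={idf_a:.3f}, term_b={idf_b:.3f}")
print(f"tf-rf:  term_a={rf_a:.3f}, term_b={rf_b:.3f}")
```

With these made-up counts, tf-idf gives both terms the same weight because it only sees the total document frequency, while the assumed rf factor gives more weight to the term concentrated in the positive category. This is one way to read the abstract's claim that idf adds little discriminating power for categorization.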

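Original language: English
Title of host publication: 14th International World Wide Web Conference, WWW2005
Pages: 1032-1033
Number of pages: 2
DOI
Publication status: Published - 2005
Externally published: Yes
Event: 14th International World Wide Web Conference, WWW2005 - Chiba, Japan
Duration: 10 May 2005 to 14 May 2005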

Publication series

Name: 14th International World Wide Web Conference, WWW2005

Conference

Conference: 14th International World Wide Web Conference, WWW2005
Country/Territory: Japan
City: Chiba
Period: 10/05/05 to 14/05/05
