Bayesian performance comparison of text classifiers

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

How can we know whether one classifier is really better than the other? In the area of text classification, since the publication of Yang and Liu's seminal SIGIR-1999 paper, it has become standard practice for researchers to apply null-hypothesis significance testing (NHST) to their experimental results in order to establish the superiority of a classifier. However, such a frequentist approach has a number of inherent deficiencies and limitations, e.g., the inability to accept the null hypothesis (that the two classifiers perform equally well), the difficulty of comparing commonly used multivariate performance measures like F1 scores instead of accuracy, and so on. In this paper, we propose a novel Bayesian approach to the performance comparison of text classifiers, and argue its advantages over the traditional frequentist approach based on tests such as the t-test. In contrast to the existing probabilistic model for F1 scores, which is unpaired, our proposed model takes the correlation between classifiers into account and thus achieves greater statistical power. Using several typical text classification algorithms and a benchmark dataset, we demonstrate that our approach provides rich information about the difference between two classifiers' performances.

Original language: English
Title of host publication: SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 15-24
Number of pages: 10
ISBN (electronic): 9781450342902
DOI
Publication status: Published - 7 Jul 2016
Event: 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016 - Pisa, Italy
Duration: 17 Jul 2016 to 21 Jul 2016

Publication series

Name: SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016
Country/Territory: Italy
City: Pisa
Period: 17/07/16 to 21/07/16
