TY - GEN
T1 - Prototype vector machine for large scale semi-supervised learning
AU - Zhang, Kai
AU - Kwok, James T.
AU - Parvin, Bahram
PY - 2009
Y1 - 2009
N2 - Practical data mining rarely falls exactly into the supervised learning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computational intensiveness of graph-based SSL arises largely from the manifold or graph regularization, which in turn leads to large models that are difficult to handle. To alleviate this, we propose the prototype vector machine (PVM), a highly scalable, graph-based algorithm for large-scale SSL. Our key innovation is the use of "prototype vectors" for efficient approximation of both the graph-based regularizer and the model representation. The choice of prototypes is grounded on two important criteria: they not only perform effective low-rank approximation of the kernel matrix, but also span a model that suffers minimal information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.
AB - Practical data mining rarely falls exactly into the supervised learning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computational intensiveness of graph-based SSL arises largely from the manifold or graph regularization, which in turn leads to large models that are difficult to handle. To alleviate this, we propose the prototype vector machine (PVM), a highly scalable, graph-based algorithm for large-scale SSL. Our key innovation is the use of "prototype vectors" for efficient approximation of both the graph-based regularizer and the model representation. The choice of prototypes is grounded on two important criteria: they not only perform effective low-rank approximation of the kernel matrix, but also span a model that suffers minimal information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.
UR - https://www.scopus.com/pages/publications/70049106797
U2 - 10.1145/1553374.1553531
DO - 10.1145/1553374.1553531
M3 - Conference contribution
AN - SCOPUS:70049106797
SN - 9781605585161
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 26th Annual International Conference on Machine Learning, ICML'09
T2 - 26th Annual International Conference on Machine Learning, ICML'09
Y2 - 14 June 2009 through 18 June 2009
ER -