Prototype vector machine for large scale semi-supervised learning

Kai Zhang, James T. Kwok, Bahram Parvin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Practical data mining rarely falls exactly into the supervised learning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computational intensiveness of graph-based SSL arises largely from the manifold or graph regularization, which in turn leads to large models that are difficult to handle. To alleviate this, we propose the prototype vector machine (PVM), a highly scalable, graph-based algorithm for large-scale SSL. Our key innovation is the use of "prototype vectors" for efficient approximation of both the graph-based regularizer and the model representation. The choice of prototypes is grounded on two important criteria: they not only provide an effective low-rank approximation of the kernel matrix, but also span a model that suffers minimal information loss compared with the complete model. We demonstrate the encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.
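
The abstract describes using a small set of prototype vectors to approximate the kernel matrix and to express the model compactly. The sketch below is only an illustration of that general idea, not the authors' implementation: the choice of prototypes via k-means centers, the RBF kernel, the gamma value, and the helper names choose_prototypes and nystrom_approximation are all assumptions made for this example.

# Minimal sketch (assumptions noted above): prototype selection and a
# Nystrom-style low-rank approximation of the kernel matrix.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def choose_prototypes(X, m, seed=0):
    # One common heuristic (assumed here): use k-means centers as prototypes.
    km = KMeans(n_clusters=m, n_init=10, random_state=seed).fit(X)
    return km.cluster_centers_

def nystrom_approximation(X, prototypes, gamma=1.0):
    # Returns factors such that K is approximated by C @ pinv(W) @ C.T,
    # so the full n-by-n kernel matrix never has to be formed explicitly.
    C = rbf_kernel(X, prototypes, gamma=gamma)            # n x m cross-kernel
    W = rbf_kernel(prototypes, prototypes, gamma=gamma)   # m x m prototype kernel
    return C, W

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))                # toy unlabeled data
    P = choose_prototypes(X, m=50)                 # 50 prototypes for 1000 points
    C, W = nystrom_approximation(X, P, gamma=0.5)
    K_approx = C @ np.linalg.pinv(W) @ C.T         # low-rank surrogate for K
    print("approximate kernel shape:", K_approx.shape)

In a semi-supervised setting, a model expressed only in terms of the m prototypes (rather than all n points) keeps both the regularizer and the predictor small, which is the scalability argument the abstract makes.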

Original language: English
Title of host publication: Proceedings of the 26th Annual International Conference on Machine Learning, ICML'09
DOIs
State: Published - 2009
Externally published: Yes
Event: 26th Annual International Conference on Machine Learning, ICML'09 - Montreal, QC, Canada
Duration: 14 Jun 2009 - 18 Jun 2009

Publication series

Name: ACM International Conference Proceeding Series
Volume: 382

Conference

Conference: 26th Annual International Conference on Machine Learning, ICML'09
Country/Territory: Canada
City: Montreal, QC
Period: 14/06/09 - 18/06/09
