Distributed Estimation of Principal Support Vector Machines for Sufficient Dimension Reduction

  • Jun Jin
  • Chao Ying
  • Zhou Yu*

  *Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed


Abstract

The principal support vector machines method is a powerful tool for sufficient dimension reduction: it replaces the original predictors with low-dimensional linear combinations while preserving the information relevant to regression and classification. However, its computational burden limits its use on massive data. To address this issue, we propose two distributed estimation algorithms, one naive and one refined, for fast implementation when the sample size is large. Both distributed sufficient dimension reduction estimators attain the same statistical efficiency as the estimator computed on the full merged data, which provides rigorous statistical guarantees for their application to large-scale datasets. The refined method requires smaller batch sample sizes and is therefore more advantageous when memory on the distributed machines is limited. The two distributed algorithms are further adapted to principal weighted support vector machines for sufficient dimension reduction in binary classification. The statistical accuracy and computational complexity of the proposed methods are examined through comprehensive simulation studies and a real data application with more than 600,000 samples.
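
For orientation, the estimation problem summarized above can be sketched in symbols. The first display is the standard population objective for principal support vector machines as it is commonly written in the sufficient dimension reduction literature. The second is only an assumed reading of what a "naive" distributed estimator might look like, namely averaging the solutions computed on each batch. The symbols $q_r$, $h$, $K$, $\lambda$, $\hat{\psi}_r^{(k)}$ and $\hat{M}$ are illustrative and are not taken from this abstract, and the refined algorithm is not shown.

```latex
% Standard PSVM objective for the r-th dividing point q_r (notation assumed)
(\psi_r, t_r) = \arg\min_{\psi,\, t}\;
  \psi^\top \Sigma\, \psi
  + \lambda\, E\Big[\, 1 - \tilde{Y}_r \big\{ \psi^\top (X - E X) - t \big\} \Big]^{+},
\qquad
\tilde{Y}_r = \mathbb{1}(Y > q_r) - \mathbb{1}(Y \le q_r),
\quad r = 1, \dots, h-1 .

% Assumed naive distributed step: average the K batch-level solutions,
% then form the candidate matrix
\bar{\psi}_r = \frac{1}{K} \sum_{k=1}^{K} \hat{\psi}_r^{(k)},
\qquad
\hat{M} = \sum_{r=1}^{h-1} \bar{\psi}_r\, \bar{\psi}_r^\top .
```

Under this reading, the leading eigenvectors of $\hat{M}$ would serve as the estimated basis of the dimension reduction subspace, and each machine would only need to hold its own batch in memory.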

Original language: English
Pages (from-to): 254-266
Number of pages: 13
Journal: Technometrics
Volume: 67
Issue number: 2
State: Published - 2025

Keywords

  • Distributed estimation
  • Principal support vector machine
  • Sliced inverse regression
  • Sufficient dimension reduction
