A General M-estimation Theory in Semi-Supervised Framework

Shanshan Song, Yuanyuan Lin, Yong Zhou

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

We study a class of general M-estimators in the semi-supervised setting, wherein the data are typically a combination of a relatively small labeled dataset and large amounts of unlabeled data. A new estimator, which efficiently uses the useful information contained in the unlabeled data, is proposed via a projection technique. We prove consistency and asymptotic normality, and provide an inference procedure based on K-fold cross-validation. The optimal weights are derived to balance the contributions of the labeled and unlabeled data. It is shown that the proposed method, by taking advantage of the unlabeled data, produces asymptotically more efficient estimation of the target parameters than the supervised counterpart. Supportive numerical evidence is shown in simulation studies. Applications are illustrated in analysis of the homeless data in Los Angeles. Supplementary materials for this article are available online.

Original language: English
Pages (from-to): 1065-1075
Number of pages: 11
Journal: Journal of the American Statistical Association
Volume: 119
Issue number: 546
State: Published - 2024

Keywords

  • Projection method
  • Semi-supervised inference
  • Weighted loss function
