Semi-supervised learning for various comparison functions across two populations

Menghua Zhang, Mengjiao Peng, Yong Zhou

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Estimating comparison functions is crucial in numerous domains, such as econometrics, clinical medicine, and public health, where evaluating the effectiveness of interventions or treatment effects is a central concern. In many scenarios, response variables are much more expensive to collect than covariates. To tackle the resulting challenge of limited labeled data, we present a unified semi-supervised learning (SSL) framework for estimating comparison functions, such as the difference in means or in event probabilities between two independent samples and the survival competition probability. The framework leverages unlabeled data, which contain only covariate observations, to improve estimation accuracy. Specifically, we propose a class of efficient and adaptive estimators for comparison functions that use both the labeled and the unlabeled data under the semi-supervised (SS) framework. We establish the consistency and asymptotic normality of the proposed estimators and provide the optimal weight yielding the most efficient estimator. Furthermore, the resulting estimator is shown to be semiparametric efficient if the working model is correctly specified. Extensive numerical simulations are conducted to confirm the consistency and efficiency of our proposed estimators. An application to real data extracted from the 2001 Medical Expenditure Panel Survey (MEPS) is also included.
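
To make the idea of borrowing strength from covariate-only observations concrete, the following is a minimal sketch for the simplest comparison function named in the abstract, the difference in means. It is not the estimator proposed in the article: the abstract does not spell out the estimator class or the optimal weight, so this sketch assumes a plain linear working model fitted by least squares and averages its predictions over all covariates in each population. The function names `ss_mean` and `ss_mean_difference` and the simulated data are hypothetical and chosen for illustration only.

```python
import numpy as np


def ss_mean(x_lab, y_lab, x_unlab):
    """Illustrative semi-supervised mean of Y (assumed linear working model).

    Fits least squares on the labeled pairs, then averages the fitted
    values over labeled and unlabeled covariates together.
    """
    y_lab = np.asarray(y_lab, dtype=float)
    # Coerce covariates to 2-D arrays of shape (n, p).
    x_lab = np.asarray(x_lab, dtype=float).reshape(len(y_lab), -1)
    x_unlab = np.asarray(x_unlab, dtype=float).reshape(-1, x_lab.shape[1])

    # Least-squares fit with an intercept on the labeled sample.
    design = np.column_stack([np.ones(len(y_lab)), x_lab])
    beta, *_ = np.linalg.lstsq(design, y_lab, rcond=None)

    # Average the working-model predictions over all observed covariates.
    x_all = np.vstack([x_lab, x_unlab])
    design_all = np.column_stack([np.ones(len(x_all)), x_all])
    return float(np.mean(design_all @ beta))


def ss_mean_difference(pop1, pop2):
    """Difference of the illustrative semi-supervised means.

    Each argument is a tuple (x_lab, y_lab, x_unlab) for one population.
    """
    return ss_mean(*pop1) - ss_mean(*pop2)


if __name__ == "__main__":
    # Hypothetical simulated data: few labels, many unlabeled covariates.
    rng = np.random.default_rng(0)

    def simulate(intercept, n_lab, n_unlab):
        x = rng.normal(size=(n_lab + n_unlab, 1))
        y = intercept + 2.0 * x[:n_lab, 0] + rng.normal(size=n_lab)
        return x[:n_lab], y, x[n_lab:]

    pop1 = simulate(1.0, 50, 2000)
    pop2 = simulate(0.0, 50, 2000)
    print("labeled-only difference:", pop1[1].mean() - pop2[1].mean())
    print("semi-supervised difference:", ss_mean_difference(pop1, pop2))
```

Because the least-squares fit includes an intercept, the labeled-sample mean of the response equals the labeled-sample mean of the fitted values, so averaging predictions over all covariates amounts to correcting the labeled-only mean by the gap between labeled and pooled covariate information. The weighting and adaptivity described in the abstract go beyond this sketch.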

Original language: English
Article number: 18
Journal: Statistical Papers
Volume: 66
Issue number: 1
DOIs
State: Published - Feb 2025

Keywords

  • Adaptivity
  • Comparison functions
  • Model-free estimator
  • Semi-supervised learning
  • Semiparametric efficiency
