Model Averaging for Prediction With Fragmentary Data

Fang Fang, Wei Lan, Jingjing Tong, Jun Shao

Research output: Contribution to journal › Article › peer-review

33 Scopus citations

Abstract

One main challenge for statistical prediction with data from multiple sources is that not all the associated covariate data are available for many sampled subjects. Consequently, we need new statistical methodology to handle this type of “fragmentary data,” which has become increasingly common in recent years. In this article, we propose a novel method based on frequentist model averaging that fits some candidate models using all available covariate data. The weights in model averaging are selected by delete-one cross-validation based on the data from complete cases. The optimality of the selected weights is rigorously proved under some conditions. The finite sample performance of the proposed method is confirmed by simulation studies. An example of personal income prediction based on real data from a leading e-community of wealth management in China is also presented for illustration.
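The idea in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the description, not the authors' implementation: the abstract does not say how the candidate models are chosen or how the weights are optimised, so everything below is an assumption. Here each distinct observed-covariate pattern defines one linear candidate model, each model is fitted on every subject for which its covariates are available, and the averaging weights minimise the delete-one cross-validation squared error on the complete cases over the probability simplex. The function names (`fit_fragmentary_average`, `predict_average`, `simplex_weights`) and the Frank-Wolfe solver are hypothetical choices made for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients, with an intercept column prepended."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict_ols(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

def simplex_weights(P, y, n_iter=500):
    """Frank-Wolfe minimisation of ||y - P w||^2 over weights w >= 0, sum(w) = 1."""
    w = np.full(P.shape[1], 1.0 / P.shape[1])
    for _ in range(n_iter):
        resid = y - P @ w
        k = int(np.argmin(-2.0 * P.T @ resid))  # vertex with the steepest descent
        d = -w.copy()
        d[k] += 1.0                              # direction e_k - w
        Pd = P @ d
        denom = Pd @ Pd
        if denom <= 1e-12:
            break
        step = np.clip((resid @ Pd) / denom, 0.0, 1.0)  # exact line search
        if step == 0.0:
            break
        w = w + step * d
    return w

def fit_fragmentary_average(X, y):
    """X uses np.nan for unavailable covariates; at least one complete case is assumed."""
    observed = ~np.isnan(X)
    # Assumption: one candidate model per distinct availability pattern in the data.
    patterns = [np.flatnonzero(r) for r in np.unique(observed.astype(int), axis=0) if r.any()]
    usable = [np.flatnonzero(observed[:, cols].all(axis=1)) for cols in patterns]
    complete = np.flatnonzero(observed.all(axis=1))

    # Delete-one cross-validation predictions, evaluated on complete cases only.
    cv_pred = np.empty((len(complete), len(patterns)))
    for j, i in enumerate(complete):
        for k, (cols, rows) in enumerate(zip(patterns, usable)):
            train = rows[rows != i]
            beta = fit_ols(X[np.ix_(train, cols)], y[train])
            cv_pred[j, k] = predict_ols(beta, X[i, cols][None, :])[0]

    weights = simplex_weights(cv_pred, y[complete])
    betas = [fit_ols(X[np.ix_(rows, cols)], y[rows]) for cols, rows in zip(patterns, usable)]
    return patterns, betas, weights

def predict_average(patterns, betas, weights, X_new):
    """Weighted average of candidate-model predictions for fully observed rows."""
    preds = np.column_stack([predict_ols(b, X_new[:, cols]) for cols, b in zip(patterns, betas)])
    return preds @ weights

# Synthetic usage: three availability patterns, 20 subjects each (made-up data).
rng = np.random.default_rng(0)
n, p = 60, 3
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.normal(size=n)
X[20:40, 2] = np.nan   # second source lacks the third covariate
X[40:, 1:] = np.nan    # third source only records the first covariate
patterns, betas, weights = fit_fragmentary_average(X, y)
y_hat = predict_average(patterns, betas, weights, rng.normal(size=(5, p)))
```

The point of the sketch is that no subject is discarded when fitting: a subject with only the first covariate still contributes to the smallest candidate model, while the complete cases are reserved for comparing the models and choosing the weights.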

Original language: English
Pages (from-to): 517-527
Number of pages: 11
Journal: Journal of Business and Economic Statistics
Volume: 37
Issue number: 3
State: Published - 3 Jul 2019

Keywords

  • Asymptotic optimality
  • Cross-validation
  • Heteroscedastic errors
  • Linear regression models
  • Multiple data sources
