Abstract
Personal credit has long been a topic of public concern. Evaluating default risk is of particular interest, because robust estimation based on personal information can both help individuals in need obtain loans and help financial institutions avoid losses. So far, satisfactory solutions have been lacking because data, especially default information, are limited. In the era of big data, it has become possible to improve the effectiveness of estimates by using auxiliary information from external studies or public domains. However, individual-level data cannot be obtained directly because of the emphasis on data privacy; only summary statistics carrying auxiliary information may be shared. To make effective use of such external summary information and improve the accuracy of default risk estimation, this paper introduces a unified auxiliary information framework, referred to as the enhanced GEE method, which incorporates various external summary results by employing the generalized estimating equations (GEE) approach and augmenting the GEE function with a weighted logarithm of the confidence density. We establish asymptotic properties for the new method and prove that it gains statistical efficiency over the study-specific estimator that uses no auxiliary information. In addition, we develop a low-cost Map-Reduce procedure for distributed statistical inference with the enhanced GEE method in big data settings, which achieves the same efficiency as the oracle enhanced GEE approach under mild conditions. The method is demonstrated in an application predicting the loan default risk of bank customers in Shanghai and is shown to be more effective and reliable than the method based on internal data only. The advantages of our approach, especially the construction of tighter confidence intervals, are further illustrated with extensive simulation studies and a real personal default risk case.
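As a rough illustration of the augmentation described in the abstract (a sketch under stated assumptions, not the paper's actual formulation), suppose the external study shares only a summary estimate $\tilde\theta$ with estimated covariance $\tilde V$, and that its confidence density $h(\theta)$ is approximated by a normal density. The symbols $g_i(\theta)$ for the internal GEE contribution of subject $i$, $h$, and the weight $\lambda \ge 0$ are notation assumed here for illustration. Adding the gradient of the weighted log confidence density to the estimating function would then give:

```latex
% Assumed notation (not taken from the paper):
%   g_i(\theta)               internal GEE contribution of subject i
%   (\tilde\theta, \tilde V)  external summary estimate and its covariance
%   \lambda \ge 0             weight on the external information
\log h(\theta) = -\tfrac{1}{2}(\theta-\tilde\theta)^{\top}\tilde V^{-1}(\theta-\tilde\theta) + \text{const},
\qquad
\sum_{i=1}^{n} g_i(\theta) \;-\; \lambda\,\tilde V^{-1}(\theta-\tilde\theta) \;=\; 0 .
```

Under this sketch, $\lambda = 0$ recovers the study-specific GEE, while a larger $\lambda$ pulls the solution toward the external estimate in the directions where that estimate is precise.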
| Original language | English |
|---|---|
| Pages (from-to) | 2863-2886 |
| Number of pages | 24 |
| Journal | Annals of Applied Statistics |
| Volume | 18 |
| Issue number | 4 |
| State | Published - Dec 2024 |
Keywords
- External auxiliary information
- confidence density
- distributed statistical inference
- generalized estimating equations
- individual-level data
Title
INCORPORATING AUXILIARY INFORMATION FOR IMPROVED STATISTICAL INFERENCE AND ITS EXTENSIONS TO DISTRIBUTED ALGORITHMS WITH AN APPLICATION TO PERSONAL CREDIT