
Communication-efficient Distributed Statistical Inference for Massive Data with Heterogeneous Auxiliary Information

  • CAS - Academy of Mathematics and System Sciences
  • Chengdu No.7 High School

Research output: Contribution to journal › Article › peer-review

Abstract

Heterogeneous auxiliary information commonly arises in big data due to diverse study settings and privacy constraints. Excluding such indirect evidence often results in a substantial loss of statistical inference efficiency. This article proposes a novel framework for integrating a mixture of individual-level data and multiple external heterogeneous summary statistics by multiplying likelihood functions and confidence densities. Theoretically, we show that the proposed method possesses desirable properties and can achieve statistical efficiency comparable to that of the individual participant data (IPD) estimator, which uses all available individual-level data. Furthermore, we develop a communication-efficient distributed inference procedure for massive datasets with heterogeneous auxiliary information. We demonstrate that the proposed iterative algorithm achieves linear convergence under general conditions or generalized linear models. Finally, extensive simulations and real data applications are conducted to illustrate the performance of the proposed methods.
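As a toy illustration of multiplying a likelihood by a confidence density, the sketch below combines individual-level observations with a single external summary statistic. It is not the article's estimator: it assumes a normal mean model with known standard deviation, and an external study that reports an estimate of the same parameter with a standard error, so the heterogeneity handled in the article is ignored. The function name and all arguments are hypothetical.

```python
import math


def combine_likelihood_and_confidence_density(x, sigma, ext_estimate, ext_se):
    """Maximize (normal likelihood of x) * (normal confidence density of the external estimate).

    Assumptions (illustrative only, not taken from the article):
      - x_i ~ N(theta, sigma^2) with sigma known;
      - the external study targets the same theta and reports
        ext_estimate with standard error ext_se.
    Under these assumptions the product is proportional to a normal density
    in theta, so its maximizer is a precision-weighted average.
    """
    n = len(x)
    internal_mean = sum(x) / n
    internal_precision = n / sigma**2      # information in the individual-level data
    external_precision = 1.0 / ext_se**2   # information in the summary statistic
    total_precision = internal_precision + external_precision
    theta_hat = (
        internal_precision * internal_mean + external_precision * ext_estimate
    ) / total_precision
    standard_error = math.sqrt(1.0 / total_precision)
    return theta_hat, standard_error


theta_hat, standard_error = combine_likelihood_and_confidence_density(
    x=[0.0, 2.0], sigma=1.0, ext_estimate=4.0, ext_se=0.5
)
print(theta_hat, standard_error)
```

In this simplified setting the combined standard error is never larger than that of either source alone, which is the intuition behind the efficiency gain from including external summaries.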

Original language: English
Article number: 28
Journal: Journal of Machine Learning Research
Volume: 27
Publication status: Published - 2026
