Enhancements of communication-efficient distributed statistical inference and its privacy preservation

Research output: Contribution to journal › Article › peer-review

Abstract

In the modern era of big data, the vast amount of available data has opened up more ways to analyze important economic and financial issues. For example, predictions of the probability of individual default have become more accurate: as data volume has grown, the number of observed defaulted individuals has increased year-on-year, allowing a more detailed characterization of the defaulted population. However, big data also presents new challenges. One is that samples are stored separately on different machines and cannot be transferred directly, because of privacy considerations and limited data storage capacity. This paper develops an improved communication-efficient distributed algorithm in which more local summarized information is used to estimate the high-order derivatives of the loss function at lower communication cost. Furthermore, to protect privacy in the interacted vector, we design a privacy-preserving algorithm based on the differential privacy constraint that adds a Laplace-distributed noise term to the parameters; the approach can be extended to cases beyond distributed architectures. Both the non-private and private schemes, in which only local estimators are passed from the local machines to the central machine, are more accurate and efficient than their counterparts, both in theory and in practice. We then suggest a bootstrap scheme to estimate the covariance matrix of the parametric estimators, which supports effective inference. Finally, we find that the proposed method effectively handles practical tasks, namely accurate probabilistic prediction of default risk and climate activity.
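To make the privacy step concrete, the following is a minimal sketch of a generic Laplace mechanism in a distributed setting. It is not the paper's algorithm: it assumes a one-dimensional mean as the local estimator, clipping bounds to obtain a bounded sensitivity, and simple one-shot averaging on the central machine, and it omits the high-order derivative corrections described above. All function names, the bounds, and the value of epsilon are hypothetical choices for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_local_mean(samples, lower, upper, epsilon, rng):
    # Assumption: clip each observation so the local mean has bounded
    # L1 sensitivity (upper - lower) / n when one record changes.
    clipped = [min(max(x, lower), upper) for x in samples]
    local_estimate = sum(clipped) / len(clipped)
    sensitivity = (upper - lower) / len(clipped)
    # Laplace mechanism: noise scale = sensitivity / epsilon.
    return local_estimate + laplace_noise(sensitivity / epsilon, rng)


def aggregate(local_estimates):
    # The central machine only ever sees the noisy local estimators.
    return sum(local_estimates) / len(local_estimates)


rng = random.Random(0)
# Hypothetical data: 10 machines, 2000 observations each, true mean 1.0.
machines = [[rng.gauss(1.0, 1.0) for _ in range(2000)] for _ in range(10)]
private_estimates = [
    private_local_mean(m, lower=-4.0, upper=6.0, epsilon=1.0, rng=rng)
    for m in machines
]
global_estimate = aggregate(private_estimates)
print(round(global_estimate, 3))
```

Only one scalar per machine crosses the network here, which reflects the communication pattern the abstract describes; a smaller epsilon gives stronger privacy at the cost of noisier local estimators.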

Original language: English
Article number: 106125
Journal: Journal of Econometrics
Volume: 253
State: Published - Jan 2026

Keywords

  • Communication efficiency
  • Differential data privacy
  • Distributed algorithm
  • Laplace mechanism
  • M-estimation
