A Simple Divide-and-Conquer-based Distributed Method for the Accelerated Failure Time Model

  • Lanjue Chen
  • Jin Su*
  • Alan T.K. Wan*
  • Yong Zhou

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

The accelerated failure time (AFT) model is an appealing tool in survival analysis because of its ease of interpretation, but when there is a large volume of data, fitting an AFT model and carrying out the associated inference on one computer can be computationally demanding. This poses a severe limitation for the application of the AFT model in the face of big data. The article addresses this problem by developing a simple distributed method for estimating the parameters of an AFT model based on the divide-and-conquer strategy, which has the dual benefits of statistical efficiency and computational economy. It is an iterative method that involves, for the most part, some rather simple algebraic operations, except for obtaining the initial estimate, which is based on a smoothed approximation of the Gehan estimating equation. Our results show that the proposed method yields estimates that converge after a few iterations and an estimator that is asymptotically as efficient as the benchmark estimator obtained by using the full data in one go. We also develop an associated inference procedure. The merits of the proposed method are demonstrated via an extensive simulation study. The method is applied to a kidney transplantation dataset. Supplementary materials for this article are available online.
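The divide-and-conquer idea described in the abstract can be illustrated with a minimal sketch. It is not the article's algorithm: the logistic smoothing of the Gehan indicator, the bandwidth `h`, the gradient-descent initial fit on a single block, and the Newton-type aggregation step are all assumptions made for illustration, as are the function names `smoothed_gehan`, `local_fit`, and `distributed_fit` and the simulated data. Each block reports only a small vector and matrix, so the central step reduces to simple linear algebra.

```python
import numpy as np


def smoothed_gehan(beta, logt, delta, X, h=0.3):
    """Smoothed Gehan estimating function U and its Jacobian J on one block.

    The indicator 1{e_i <= e_j} is replaced by a logistic CDF with
    bandwidth h (an illustrative choice, not taken from the article).
    """
    n = len(logt)
    e = logt - X @ beta                      # residuals on the log-time scale
    de = (e[None, :] - e[:, None]) / h       # de[i, j] = (e_j - e_i) / h
    dX = X[:, None, :] - X[None, :, :]       # dX[i, j] = x_i - x_j
    s = 0.5 * (1.0 + np.tanh(0.5 * de))      # overflow-safe logistic sigmoid
    U = np.einsum("ij,ijk->k", delta[:, None] * s, dX) / n**2
    w = delta[:, None] * s * (1.0 - s) / h   # derivative of the smooth indicator
    J = np.einsum("ij,ijk,ijl->kl", w, dX, dX) / n**2
    return U, J


def local_fit(logt, delta, X, h=0.3, iters=300):
    """Initial estimate on one block.

    U is the gradient of a convex smoothed Gehan loss, so plain gradient
    descent with step 1/L (L an upper bound on the curvature) is safe.
    """
    dX = X[:, None, :] - X[None, :, :]
    M = np.einsum("ijk,ijl->kl", dX, dX) / len(logt) ** 2
    step = 4.0 * h / np.linalg.eigvalsh(M)[-1]   # sigmoid slope is at most 1/4
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        U, _ = smoothed_gehan(beta, logt, delta, X, h)
        beta = beta - step * U
    return beta


def distributed_fit(blocks, h=0.3, rounds=5):
    """Hypothetical aggregation: start from one block, then refine with
    Newton-type updates built from per-block summaries (U_k, J_k)."""
    beta = local_fit(*blocks[0], h=h)
    for _ in range(rounds):
        stats = [smoothed_gehan(beta, *blk, h=h) for blk in blocks]  # one per machine
        U = sum(u for u, _ in stats)
        J = sum(j for _, j in stats)
        beta = beta - np.linalg.solve(J, U)      # central update
    return beta


# Simulated right-censored data (illustrative values only).
rng = np.random.default_rng(0)
n, beta0 = 1000, np.array([1.0, -0.5])
X = rng.normal(size=(n, 2))
log_T = X @ beta0 + 0.5 * rng.normal(size=n)     # log failure times
log_C = 1.0 + rng.normal(size=n)                 # log censoring times
logt = np.minimum(log_T, log_C)                  # observed log times
delta = (log_T <= log_C).astype(float)           # 1 = event observed

# Split into 5 blocks of 200 observations, as if stored on 5 machines.
blocks = [(logt[i:i + 200], delta[i:i + 200], X[i:i + 200])
          for i in range(0, n, 200)]
beta_hat = distributed_fit(blocks)
print("estimate:", beta_hat, "truth:", beta0)
```

In this sketch each block forms pairwise comparisons only among its own observations, so the per-block cost is quadratic in the block size rather than in the full sample size. The pooled equation it solves is therefore not identical to the full-data Gehan equation; how the article's method relates to the full-data benchmark is established in the article itself, not here.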

Original language: English
Pages (from-to): 681-698
Number of pages: 18
Journal: Journal of Computational and Graphical Statistics
Volume: 33
Issue number: 2
State: Published - 2024

Keywords

  • Accelerated failure time model
  • Algorithm
  • Big data
  • Distributed inference
  • Divide-and-conquer
  • Gehan estimating equation
