TY - JOUR
T1 - A Simple Divide-and-Conquer-based Distributed Method for the Accelerated Failure Time Model
AU - Chen, Lanjue
AU - Su, Jin
AU - Wan, Alan T.K.
AU - Zhou, Yong
N1 - Publisher Copyright: © 2023 American Statistical Association and Institute of Mathematical Statistics.
PY - 2024
Y1 - 2024
N2 - The accelerated failure time (AFT) model is an appealing tool in survival analysis because of its ease of interpretation, but when there is a large volume of data, fitting an AFT model and carrying out the associated inference on one computer can be computationally demanding. This poses a severe limitation for the application of the AFT model in the face of big data. The article addresses this problem by developing a simple distributed method for estimating the parameters of an AFT model based on the divide-and-conquer strategy, which has the dual benefits of statistical efficiency and computational economy. It is an iterative method that involves, for the most part, some rather simple algebraic operations, except for obtaining the initial estimate, which is based on a smoothed approximation of the Gehan estimating equation. Our results show that the proposed method yields estimates that converge after a few iterations and an estimator that is asymptotically as efficient as the benchmark estimator obtained by using the full data in one go. We also develop an associated inference procedure. The merits of the proposed method are demonstrated via an extensive simulation study. The method is applied to a kidney transplantation dataset. Supplementary materials for this article are available online.
KW - Accelerated failure time model
KW - Algorithm
KW - Big data
KW - Distributed inference
KW - Divide-and-conquer
KW - Gehan estimating equation
UR - https://www.scopus.com/pages/publications/85173743106
U2 - 10.1080/10618600.2023.2252028
DO - 10.1080/10618600.2023.2252028
M3 - Article
AN - SCOPUS:85173743106
SN - 1061-8600
VL - 33
SP - 681
EP - 698
JO - Journal of Computational and Graphical Statistics
JF - Journal of Computational and Graphical Statistics
IS - 2
ER -