Which test for crossing survival curves? A user’s guideline

  • Ina Dormuth*
  • Tiantian Liu
  • Jin Xu
  • Menggang Yu
  • Markus Pauly
  • Marc Ditzhaus

*Corresponding author for this work

Research output: Contribution to journal / Article / peer-review

40 Scopus citations

Abstract

Background: The exchange of knowledge between statisticians who develop new methodology and the clinicians, reviewers, or authors who apply it is fundamental. This is especially true for clinical trials with time-to-event endpoints. One of the most common questions in this setting is whether the survival distributions in a two-armed trial are equal. The log-rank test is still the gold standard for answering this question. However, under non-proportional hazards its power can become poor, and multiple extensions have been developed to overcome this issue. We aim to facilitate the choice of a test for detecting survival differences in the case of crossing hazards.

Methods: We restricted the review to the most recent two-armed clinical oncology trials with crossing survival curves. Each data set was reconstructed using a state-of-the-art reconstruction algorithm. To ensure reproduction quality, only publications with published numbers at risk at multiple time points, sufficient print quality, and a non-informative censoring pattern were included. This article reports the p-values of the log-rank and Peto-Peto tests as references and compares them with nine different tests developed for detecting survival differences in the presence of non-proportional or crossing hazards.

Results: We reviewed 1400 recent phase III clinical oncology trials and selected fifteen studies that met our eligibility criteria for data reconstruction. After adding three further individual patient data sets, the investigated tests found significant differences in survival for nine of the eighteen studies. Reviewers should note that 28% of the studies with published survival curves did not report the number at risk, which makes reconstruction and plausibility checks almost impossible.

Conclusions: The evaluation shows that inference methods constructed to detect differences in survival in the presence of non-proportional hazards are beneficial, and it helps to provide guidance in choosing a sensible alternative to the standard log-rank test.

Original language: English
Article number: 34
Journal: BMC Medical Research Methodology
Volume: 22
Issue number: 1
State: Published - Dec 2022

Keywords

  • Crossing
  • Log-rank test
  • Non-proportional hazards
  • Oncology
  • Restricted-mean survival
  • Survival analysis
  • Time-to-event outcome
