On cumulative slicing estimation for high dimensional data

  • Cheng Wang
  • Zhou Yu
  • Liping Zhu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

In the context of sufficient dimension reduction (SDR), the sliced inverse regression (SIR) successfully reduces the covariate dimension of a high-dimensional nonlinear regression. When the covariate is low or moderate dimensional, the performance of the SIR is insensitive to the number of slices. However, our empirical studies indicate that the performance of the SIR relies heavily on the number of slices when the covariate is high or ultrahigh dimensional. Determining the optimal number of slices remains an open problem in the SDR literature, despite its importance to the effectiveness of SIR in high- and ultrahigh-dimensional regressions. Thus, we propose an improved version of the SIR, called the cumulative slicing estimation (CUME) method, that does not require selecting an optimal number of slices. We provide a general framework in which to analyze the phase transitions of the CUME method. We show that, without the sparsity assumption, the CUME method is consistent if and only if p/n → 0, where p denotes the covariate dimension, and n denotes the sample size. If we include certain sparsity assumptions, then the thresholding estimate for the CUME method is consistent as long as log(p)/n → 0. We demonstrate the superior performance of the proposed method using extensive numerical experiments.
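The abstract does not spell out the estimator, so the following is only a minimal sketch of the classical low-dimensional form of cumulative slicing, under stated assumptions: the cumulative mean is taken to be m(t) = E[(X − E X) 1{Y ≤ t}], the kernel matrix is the sample average of m(Yⱼ) m(Yⱼ)ᵀ over the observed responses, and directions are the leading eigenvectors of the kernel after standardizing by the sample covariance. The function names `cume_kernel` and `cume_directions` are hypothetical, the sketch assumes n > p with an invertible sample covariance, and it does not include the thresholding step the abstract mentions for sparse settings.

```python
import numpy as np


def cume_kernel(X, y):
    """Sample cumulative-slicing kernel (assumed form, see lead-in).

    Averages m(t) m(t)^T over the n observed cut points, where
    m(t) = (1/n) * sum_i (X_i - X_bar) * 1{y_i <= t}.
    Assumes y has no ties (e.g. a continuous response).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    order = np.argsort(y, kind="stable")
    # Row k of the cumulative sum is m(t) evaluated at the k-th smallest y,
    # so no slice count needs to be chosen.
    m = np.cumsum(Xc[order], axis=0) / n
    return m.T @ m / n


def cume_directions(X, y, d):
    """Leading d directions from the kernel, standardized by the covariance."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    sigma = Xc.T @ Xc / n
    # Symmetric inverse square root of the sample covariance (needs n > p).
    evals, evecs = np.linalg.eigh(sigma)
    inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.T
    kernel = inv_sqrt @ cume_kernel(X, y) @ inv_sqrt
    # eigh returns eigenvalues in ascending order; take the last d columns.
    _, vecs = np.linalg.eigh(kernel)
    top = vecs[:, ::-1][:, :d]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

Calling `cume_directions(X, y, d)` with an `(n, p)` array `X` and a length-`n` response `y` returns a `(p, d)` array with unit-norm columns. Because every observed response acts as a cut point, the sketch has no slice-count tuning parameter, which is the property the abstract highlights.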

Original language: English
Pages (from-to): 223-242
Number of pages: 20
Journal: Statistica Sinica
Volume: 31
Issue number: 1
State: Published - Jan 2021

Keywords

  • Cumulative slicing estimation
  • Dimension reduction
  • Sliced inverse regression
  • Sparsity
  • Sufficient
