TY - JOUR
T1 - On cumulative slicing estimation for high dimensional data
AU - Wang, Cheng
AU - Yu, Zhou
AU - Zhu, Liping
N1 - Publisher Copyright: © 2021 Institute of Statistical Science. All rights reserved.
PY - 2021/1
Y1 - 2021/1
N2 - In the context of sufficient dimension reduction (SDR), the sliced inverse regression (SIR) successfully reduces the covariate dimension of a high-dimensional nonlinear regression. When the covariate is low or moderate dimensional, the performance of the SIR is insensitive to the number of slices. However, our empirical studies indicate that the performance of the SIR relies heavily on the number of slices when the covariate is high or ultrahigh dimensional. Determining the optimal number of slices remains an open problem in the SDR literature, despite its importance to the effectiveness of SIR in high- and ultrahigh-dimensional regressions. Thus, we propose an improved version of the SIR, called the cumulative slicing estimation (CUME) method, that does not require selecting an optimal number of slices. We provide a general framework in which to analyze the phase transitions of the CUME method. We show that, without the sparsity assumption, the CUME method is consistent if and only if p/n → 0, where p denotes the covariate dimension, and n denotes the sample size. If we include certain sparsity assumptions, then the thresholding estimate for the CUME method is consistent as long as log(p)/n → 0. We demonstrate the superior performance of the proposed method using extensive numerical experiments.
KW - Cumulative slicing estimation
KW - Dimension reduction
KW - Sliced inverse regression
KW - Sparsity
KW - Sufficient
UR - https://www.scopus.com/pages/publications/85105798356
U2 - 10.5705/ss.202018.0381
DO - 10.5705/ss.202018.0381
M3 - Article
AN - SCOPUS:85105798356
SN - 1017-0405
VL - 31
SP - 223
EP - 242
JO - Statistica Sinica
JF - Statistica Sinica
IS - 1
ER -