TY - JOUR
T1 - A General Framework for Identifying Hierarchical Interactions and Its Application to Genomics Data
AU - Zhang, Xiao
AU - Shi, Xingjie
AU - Liu, Yiming
AU - Liu, Xu
AU - Ma, Shuangge
N1 - Publisher Copyright: © 2023 American Statistical Association and Institute of Mathematical Statistics.
PY - 2023
Y1 - 2023
N2 - The analysis of hierarchical interactions has long been a challenging problem due to the large number of candidate main effects and interaction effects, and the need for accommodating the “main effects, interactions” hierarchy. The two-stage analysis methods enjoy simplicity and low computational cost, but contradict the fact that the outcome of interest is attributable to the joint effects of multiple main factors and their interactions. The existing joint analysis methods can accurately describe the underlying data generating process, but suffer from prohibitively high computational cost. And it is not straightforward to extend their optimization algorithms to general loss functions. To address this need, we develop a new computational method that is much faster than the existing joint analysis methods and rivals the runtimes of two-stage analysis. The proposed method, (Formula presented.), adopts the framework of the forward and backward stagewise algorithm and enjoys computational efficiency and broad applicability. To accommodate hierarchy without imposing additional constraints, it has newly developed forward and backward steps. It naturally accommodates the strong and weak hierarchy, and makes optimization much simpler and faster than in the existing studies. Optimality of (Formula presented.) sequences is investigated theoretically. Simulations show that it outperforms the existing methods. The analysis of TCGA data on melanoma demonstrates its competitive practical performance. Supplementary materials for this article are available online.
KW - Forward and backward stagewise
KW - High-dimensional modeling
KW - Interaction analysis
KW - Lasso
KW - Penalized selection
UR - https://www.scopus.com/pages/publications/85147686102
U2 - 10.1080/10618600.2022.2152034
DO - 10.1080/10618600.2022.2152034
M3 - Article
AN - SCOPUS:85147686102
SN - 1061-8600
VL - 32
SP - 873
EP - 883
JO - Journal of Computational and Graphical Statistics
JF - Journal of Computational and Graphical Statistics
IS - 3
ER -