TY - JOUR
T1 - Structure-adaptive canonical correlation analysis for microbiome multi-omics data
AU - Deng, Linsui
AU - Tang, Yanlin
AU - Zhang, Xianyang
AU - Chen, Jun
N1 - Publisher Copyright: Copyright © 2024 Deng, Tang, Zhang and Chen.
PY - 2024
Y1 - 2024
N2 - Sparse canonical correlation analysis (sCCA) has been a useful approach for integrating different high-dimensional datasets by finding a subset of correlated features that explain the most correlation in the data. In the context of microbiome studies, investigators are always interested in knowing how the microbiome interacts with the host at different molecular levels such as genome, methylome, transcriptome, metabolome and proteome. sCCA provides a simple approach for exploiting the correlation structure among multiple omics data and finding a set of correlated omics features, which could contribute to understanding the host-microbiome interaction. However, existing sCCA methods do not address compositionality, and their application to microbiome data is thus not optimal. This paper proposes a new sCCA framework for integrating microbiome data with other high-dimensional omics data, accounting for the compositional nature of microbiome sequencing data. It also allows integrating prior structure information such as the grouping structure among bacterial taxa by imposing a “soft” constraint on the coefficients through varying penalization strength. As a result, the method provides significant improvement when the structure is informative while maintaining robustness against a misspecified structure. Through extensive simulation studies and real data analysis, we demonstrate the superiority of the proposed framework over the state-of-the-art approaches.
AB - Sparse canonical correlation analysis (sCCA) has been a useful approach for integrating different high-dimensional datasets by finding a subset of correlated features that explain the most correlation in the data. In the context of microbiome studies, investigators are always interested in knowing how the microbiome interacts with the host at different molecular levels such as genome, methylome, transcriptome, metabolome and proteome. sCCA provides a simple approach for exploiting the correlation structure among multiple omics data and finding a set of correlated omics features, which could contribute to understanding the host-microbiome interaction. However, existing sCCA methods do not address compositionality, and their application to microbiome data is thus not optimal. This paper proposes a new sCCA framework for integrating microbiome data with other high-dimensional omics data, accounting for the compositional nature of microbiome sequencing data. It also allows integrating prior structure information such as the grouping structure among bacterial taxa by imposing a “soft” constraint on the coefficients through varying penalization strength. As a result, the method provides significant improvement when the structure is informative while maintaining robustness against a misspecified structure. Through extensive simulation studies and real data analysis, we demonstrate the superiority of the proposed framework over the state-of-the-art approaches.
KW - canonical correlation analysis
KW - compositional effect
KW - dimension reduction
KW - phylogenetic tree
KW - structural information
KW - variable selection
UR - https://www.scopus.com/pages/publications/85211571263
U2 - 10.3389/fgene.2024.1489694
DO - 10.3389/fgene.2024.1489694
M3 - Article
AN - SCOPUS:85211571263
SN - 1664-8021
VL - 15
JO - Frontiers in Genetics
JF - Frontiers in Genetics
M1 - 1489694
ER -