Abstract
Gene-environment (G×E) interactions have important implications in genetic and genomic studies. Some existing G×E interaction methods are limited by analyzing only a small number of G factors at a time, assuming linear effects of E factors, assuming no data contamination, and adopting ineffective selection techniques. In this study, we propose a new approach for identifying important G×E interactions. It jointly models the effects of all E and G factors and their interactions. A partially linear varying coefficient model is adopted to accommodate possible nonlinear effects of E factors. A rank-based loss function is used to accommodate possible data contamination. Penalization, which has been extensively used with high-dimensional data, is adopted for selection. The proposed penalized estimation approach automatically determines whether a G factor has an interaction with an E factor, a main effect but no interaction, or no effect at all. It can be computed efficiently using a coordinate descent algorithm. Simulations show that the approach performs satisfactorily and outperforms several competing alternatives. The proposed approach is applied to a lung cancer study with gene expression measurements and clinical variables.
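To illustrate why a rank-based loss can accommodate data contamination, the following minimal sketch compares a Wilcoxon-type loss (the mean absolute pairwise difference of residuals) with the usual squared loss on a small residual vector, with and without one gross outlier. The specific loss form, the function names `rank_loss` and `squared_loss`, and the toy residuals are assumptions made for illustration only; they are not taken from the paper and may differ from the loss it actually uses.

```python
from itertools import combinations


def rank_loss(residuals):
    """Wilcoxon-type loss: mean absolute difference over all residual pairs.

    Assumed form for illustration; not necessarily the paper's exact loss.
    """
    pairs = list(combinations(residuals, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def squared_loss(residuals):
    """Ordinary mean squared residual, shown for comparison."""
    return sum(r * r for r in residuals) / len(residuals)


# Hypothetical residuals: a clean sample and the same sample with one outlier.
clean = [-1.0, -0.5, 0.0, 0.5, 1.0]
contaminated = clean[:-1] + [100.0]

# How much each loss is inflated by the single contaminated observation.
rank_ratio = rank_loss(contaminated) / rank_loss(clean)
sq_ratio = squared_loss(contaminated) / squared_loss(clean)

print(f"rank-based loss inflated by a factor of {rank_ratio:.1f}")
print(f"squared loss inflated by a factor of {sq_ratio:.1f}")
```

In this toy case the outlier inflates the rank-based loss by a factor of about 41 and the squared loss by a factor of about 4001, since the former grows linearly and the latter quadratically in the size of the outlying residual.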
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 4016-4030 |
| Number of pages | 15 |
| Journal | Statistics in Medicine |
| Volume | 34 |
| Issue number | 30 |
| State | Published - 30 Dec 2015 |
| Externally published | Yes |
Keywords
- gene–environment interactions
- partially linear varying coefficient model
- penalized selection
- robustness