Signal Classification in Large-Scale Multi-Sequence Integrative Analysis Under the HMM Dependence

Research output: Contribution to journal › Article › peer-review

Abstract

The integrative analysis of multiple sequences of multiple tests has become increasingly popular in many applications, especially in large-scale genomics. In the context of large-scale multiple testing, the concept of signal classification was recently developed for settings in which the same features are involved in several independent studies, with the goal of classifying each feature into one of several classes. This article considers signal classification in a generalized compound decision-making framework, where the observed data are assumed to be generated from an underlying four-state Cartesian hidden Markov model. Two oracle procedures are proposed that control the total and set-specific misclassification rates, respectively, while maximizing the number of correct classifications. Optimal data-driven procedures are also proposed, and their asymptotic properties are derived. It is shown that signal classification can be improved significantly by taking into account the dependence structure among features, and that the proposed procedures can outperform competitors that ignore this dependence structure. The proposed methods are applied to a psychiatric genetics study to detect genetic variants that affect bipolar disorder, schizophrenia, or both.
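
As a rough illustration of the modeling idea only, the sketch below simulates a four-state hidden Markov model in which each hidden state is a pair of binary indicators (signal or null in each of two studies) and computes posterior state probabilities with a scaled forward-backward pass. Everything specific here is an assumption made for illustration and is not taken from the article: the transition matrix `TRANS`, initial distribution `INIT`, signal means `MU`, the unit-variance Gaussian emissions, and the helper names. The final step assigns each feature to its most probable state, which is a plain maximum-posterior rule and not the article's oracle or data-driven procedures with misclassification-rate control.

```python
import math
import random

# Hidden state s in {0, 1, 2, 3} encodes the pair (theta1, theta2):
# 0 = null in both studies, 1 = signal in study 2 only,
# 2 = signal in study 1 only, 3 = signal in both studies.
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Illustrative parameters (assumptions, not values from the article).
MU = (2.0, 2.5)                      # signal means for study 1 and study 2
INIT = [0.7, 0.1, 0.1, 0.1]          # initial state distribution
TRANS = [                            # sticky 4x4 transition matrix
    [0.94, 0.02, 0.02, 0.02],
    [0.10, 0.80, 0.02, 0.08],
    [0.10, 0.02, 0.80, 0.08],
    [0.10, 0.05, 0.05, 0.80],
]


def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def emission(obs, state):
    """Density of a z-value pair given the joint state, assuming the two
    studies are conditionally independent given the hidden state."""
    return math.prod(
        normal_pdf(x, MU[k] * STATES[state][k]) for k, x in enumerate(obs)
    )


def simulate(n, rng):
    """Draw n hidden states and the corresponding z-value pairs."""
    states, obs = [], []
    s = rng.choices(range(4), weights=INIT)[0]
    for _ in range(n):
        states.append(s)
        obs.append(tuple(rng.gauss(MU[k] * STATES[s][k], 1.0) for k in range(2)))
        s = rng.choices(range(4), weights=TRANS[s])[0]
    return states, obs


def posteriors(obs):
    """Scaled forward-backward pass: P(state_i = s | all observations)."""
    n = len(obs)
    alpha = []
    for i in range(n):
        row = []
        for s in range(4):
            if i == 0:
                prior = INIT[s]
            else:
                prior = sum(alpha[i - 1][r] * TRANS[r][s] for r in range(4))
            row.append(prior * emission(obs[i], s))
        total = sum(row)
        alpha.append([a / total for a in row])
    beta = [[1.0] * 4 for _ in range(n)]
    for i in range(n - 2, -1, -1):
        row = [
            sum(TRANS[s][r] * emission(obs[i + 1], r) * beta[i + 1][r] for r in range(4))
            for s in range(4)
        ]
        total = sum(row)
        beta[i] = [b / total for b in row]
    post = []
    for a, b in zip(alpha, beta):
        w = [x * y for x, y in zip(a, b)]
        total = sum(w)
        post.append([x / total for x in w])
    return post


def classify(obs):
    """Assign each feature to its most probable joint state."""
    return [max(range(4), key=lambda s: p[s]) for p in posteriors(obs)]


if __name__ == "__main__":
    truth, data = simulate(500, random.Random(0))
    labels = classify(data)
    accuracy = sum(t == l for t, l in zip(truth, labels)) / len(truth)
    print(f"fraction classified correctly: {accuracy:.3f}")
```

Because the posteriors pool information from neighboring features through the transition matrix, a feature with a borderline z-value pair can be pulled toward the state of its neighbors, which is the kind of gain from modeling dependence that the abstract describes.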

Original language: English
Pages (from-to): 182-195
Number of pages: 14
Journal: Technometrics
Volume: 66
Issue number: 2
State: Published - 2024

Keywords

  • Generalized local significance index
  • Hidden Markov model
  • Integrative analysis
  • Signal classification under dependence
