A new genotype calling method for Affymetrix SNP arrays

  • Bilin Fu
  • Jin Xu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Current genotype-calling methods such as the Robust Linear Model with Mahalanobis Distance Classifier (RLMM) and the Corrected Robust Linear Model with Maximum Likelihood Classification (CRLMM) provide accurate calling results for Affymetrix Single Nucleotide Polymorphism (SNP) chips. However, these methods are computationally expensive because they employ preprocessing procedures, including chip data normalization and other sophisticated statistical techniques. In the small-sample case, their accuracy may drop significantly. We develop a new genotype calling method for Affymetrix 100 k and 500 k SNP chips. A two-stage classification scheme is proposed to obtain a fast genotype calling algorithm. The first stage uses unsupervised classification to quickly discriminate genotypes with high accuracy for the majority of the SNPs. The second stage employs a supervised classification method that incorporates allele frequency information, either from the HapMap data or from a self-training scheme. A confidence score is provided for every genotype call. The overall performance is shown to be comparable to that of CRLMM, as verified against the known gold-standard HapMap data, and is superior in small-sample cases. The new algorithm is computationally simple and standalone, in the sense that a self-training scheme can be used without employing any other training data. A package implementing the calling algorithm is freely available at .
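
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or their package: the abstract does not specify the clustering method, the features, or how allele frequencies enter the classifier. The sketch assumes, purely for illustration, a 1-D k-means on the contrast between log allele intensities for the unsupervised stage, and a Mahalanobis-distance score combined with Hardy-Weinberg priors for the supervised stage. All function names, the feature layout (column 0 = log A intensity, column 1 = log B intensity), and the ridge term are hypothetical.

```python
import numpy as np

GENOTYPES = ("AA", "AB", "BB")


def stage_one_unsupervised(contrast, n_iter=50):
    """Assumed stage 1: 1-D k-means with three clusters on the A-minus-B contrast."""
    centers = np.array([1.0, 0.0, -1.0])  # hypothetical starting centers: AA, AB, BB
    for _ in range(n_iter):
        labels = np.argmin(np.abs(contrast[:, None] - centers[None, :]), axis=1)
        for k in range(3):
            if np.any(labels == k):
                centers[k] = contrast[labels == k].mean()
    # Final assignment against the converged centers.
    labels = np.argmin(np.abs(contrast[:, None] - centers[None, :]), axis=1)
    return labels, centers


def fit_classes(features, labels, ridge=1e-3):
    """Estimate a mean and an inverse covariance per genotype cluster."""
    params = {}
    for k in range(3):
        pts = features[labels == k]
        if len(pts) < 3:  # too few points for a covariance estimate
            continue
        cov = np.cov(pts, rowvar=False) + ridge * np.eye(features.shape[1])
        params[k] = (pts.mean(axis=0), np.linalg.inv(cov))
    if not params:
        raise ValueError("no genotype cluster has enough samples")
    return params


def hardy_weinberg_priors(freq_a):
    """Assumed way to use allele frequency: Hardy-Weinberg genotype priors."""
    p, q = freq_a, 1.0 - freq_a
    return np.array([p * p, 2.0 * p * q, q * q])


def stage_two_call(x, params, priors):
    """Assumed stage 2: Mahalanobis distance plus log prior, normalised to a confidence."""
    scores = np.full(3, -np.inf)
    for k, (mean, inv_cov) in params.items():
        d = x - mean
        scores[k] = -0.5 * (d @ inv_cov @ d) + np.log(priors[k] + 1e-12)
    weights = np.exp(scores - scores.max())
    posterior = weights / weights.sum()
    best = int(np.argmax(posterior))
    return GENOTYPES[best], float(posterior[best])


def call_genotypes(features, freq_a=None):
    """Run both stages for one SNP; features has one row per sample."""
    contrast = features[:, 0] - features[:, 1]
    labels, _ = stage_one_unsupervised(contrast)
    params = fit_classes(features, labels)
    if freq_a is None:
        # Self-training variant: estimate the A allele frequency from stage-one calls.
        counts = np.bincount(labels, minlength=3)
        freq_a = (2 * counts[0] + counts[1]) / (2 * counts.sum())
    priors = hardy_weinberg_priors(freq_a)
    return [stage_two_call(x, params, priors) for x in features]
```

Passing `freq_a` corresponds to supplying an external allele frequency (for example one derived from HapMap data), while leaving it as `None` mimics a self-training scheme that needs no other training data. The score here omits the covariance log-determinant, so the returned confidence is a rough normalised score and not a calibrated posterior probability.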

Original language: English
Pages (from-to): 715-728
Number of pages: 14
Journal: Journal of Bioinformatics and Computational Biology
Volume: 9
Issue number: 6
State: Published - Dec 2011

Keywords

  • Mahalanobis distance
  • SNP chip
  • genotyping
