
A comparison of batch effect removal methods for enhancement of prediction performance using MAQC-II microarray gene expression data

  • J. Luo*
  • M. Schumacher
  • A. Scherer
  • D. Sanoudou
  • D. Megherbi
  • T. Davison
  • T. Shi
  • W. Tong
  • L. Shi
  • H. Hong
  • C. Zhao
  • F. Elloumi
  • W. Shi
  • R. Thomas
  • S. Lin
  • G. Tillinghast
  • G. Liu
  • Y. Zhou
  • D. Herman
  • Y. Li
  • Y. Deng
  • H. Fang
  • P. Bushel
  • M. Woods
  • J. Zhang
*Corresponding author for this work
  • Systems Analytics Inc.
  • Novartis
  • Spheromics
  • National and Kapodistrian University of Athens
  • University of Massachusetts Lowell
  • Almac Group
  • Chinese Academy of Sciences
  • United States Food and Drug Administration
  • Northeast Forestry University
  • Department of Biochemistry and Biophysics, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill
  • GeneGo Inc.
  • Hamner Institute of Health Sciences
  • Northwestern University
  • Riverside Cancer Care Center
  • R and D Division, SABiosciences Corporation
  • Myeloma Institute for Research and Therapy
  • University of Illinois
  • University of Southern Mississippi
  • ICF International
  • National Institutes of Health

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Batch effects are systematic non-biological differences between batches (groups) of samples in microarray experiments, arising from causes such as differences in sample preparation and hybridization protocols. Previous work has focused mainly on developing methods for effective batch effect removal. However, the impact of these methods on cross-batch prediction performance, one of the most important goals in microarray-based applications, has not been addressed. This paper uses a broad selection of data sets from the Microarray Quality Control Phase II (MAQC-II) effort, generated on three microarray platforms with different causes of batch effects, to assess the efficacy of their removal. Two data sets from cross-tissue and cross-platform experiments are also included. Across the 120 cases studied, using support vector machines (SVM) and k-nearest neighbors (KNN) as classifiers and the Matthews correlation coefficient (MCC) as the performance metric, we find that the Ratio-G, Ratio-A, EJLR, mean-centering and standardization methods perform better than or equivalent to no batch effect removal in 89, 85, 83, 79 and 75% of the cases, respectively. This suggests that applying these methods is generally advisable and that ratio-based methods are preferred.
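To make two of the named methods and the performance metric concrete, the following is a minimal Python sketch. It assumes the common textbook forms of per-batch mean-centering and standardization and the standard confusion-matrix definition of MCC; the abstract does not give the paper's exact formulations, so the function names (`mean_center`, `standardize`, `mcc`) and the toy data are illustrative assumptions only. The ratio-based methods and EJLR are not sketched, since the abstract does not define them.

```python
import math
from statistics import mean, pstdev


def mean_center(batch):
    """Subtract each gene's within-batch mean.

    `batch` is a list of samples, each a list of expression values (one per gene).
    This is the generic form of mean-centering, assumed here for illustration.
    """
    gene_means = [mean(col) for col in zip(*batch)]
    return [[x - m for x, m in zip(sample, gene_means)] for sample in batch]


def standardize(batch):
    """Mean-center each gene and scale it to unit standard deviation within the batch."""
    cols = list(zip(*batch))
    gene_means = [mean(col) for col in cols]
    # A gene that is constant within the batch has zero spread; divide by 1 instead.
    gene_sds = [pstdev(col) or 1.0 for col in cols]
    return [
        [(x - m) / s for x, m, s in zip(sample, gene_means, gene_sds)]
        for sample in batch
    ]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy batch: two samples, two genes.
batch_a = [[1.0, 10.0], [3.0, 14.0]]
centered = mean_center(batch_a)
scaled = standardize(batch_a)
score = mcc([1, 1, 0, 0], [1, 0, 0, 0])
```

In a cross-batch setting, each batch (for example, the training batch and the test batch) would be passed through the same correction function separately before a classifier is trained on one and evaluated on the other, with `mcc` computed on the test-batch predictions.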

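Original language: English
Pages (from-to): 278-291
Number of pages: 14
Journal: Pharmacogenomics Journal
Volume: 10
Issue number: 4
DOI
Publication status: Published - Aug 2010
Externally published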
