Treating Expression Levels of Different Genes as a Sample in Microarray Data Analysis: Is it Worth a Risk?

Lev Klebanov, Department of Probability and Statistics, Charles University
Andrei Yakovlev, University of Rochester, Rochester, NY

Abstract

One of the prevailing ideas in the literature on microarray data analysis is to pool the expression measures across genes and treat them as a sample drawn from some distribution. Several universal laws have been proposed to describe this distribution analytically. This idea raises a number of concerns. First, the expression levels of genes are not identically distributed random variables, so treating them as a sample amounts to sampling from a mixture of equally weighted distributions, each associated with a different gene. Second, the expression levels of different genes are heavily dependent random variables, so the law of large numbers and statistical goodness-of-fit tests are normally inapplicable to this kind of data. This dependence represents a very serious pitfall in microarray data analysis.

Submitted: October 12, 2005 · Accepted: March 17, 2006 · Published: March 24, 2006

Recommended Citation

Klebanov, Lev and Yakovlev, Andrei (2006) "Treating Expression Levels of Different Genes as a Sample in Microarray Data Analysis: Is it Worth a Risk?," Statistical Applications in Genetics and Molecular Biology: Vol. 5 : Iss. 1, Article 9.
DOI: 10.2202/1544-6115.1185
Available at: http://www.bepress.com/sagmb/vol5/iss1/art9

ISSN: 1544-6115 ©1999-2009 The Berkeley Electronic Press™ All rights reserved.
