Selection of Biologically Relevant Genes with a Wrapper Stochastic Algorithm

Kim-Anh Lê Cao, Université de Toulouse, CNRS (UMR 5219) and INRA
Olivier Gonçalves, LBP UMR CNRS 6023, Blaise Pascal University
Philippe Besse, Université de Toulouse, CNRS (UMR 5219)
Sébastien Gadat, Université de Toulouse, CNRS (UMR 5219)

Abstract

We investigate an important aspect of a meta-algorithm for selecting variables in the context of microarray data. This wrapper method starts from any classification algorithm and weights each variable (i.e., gene) according to its efficiency for classification. An optimization procedure is then derived that highlights the genes important for the biological process under study.

Theory and application with the SVM classifier were presented in Gadat and Younes (2007); here we extend the method to CART. Classification error rates are computed on three well-known public data sets (Leukemia, Colon and Prostate) and compared with those of other wrapper methods (RFE, l0-norm SVM, Random Forests), which allows us to assess the statistical relevance of the proposed algorithm. Furthermore, a biological interpretation based on the outputs of the Ingenuity Pathway Analysis software clearly shows that the gene selections from the different wrapper methods yield highly relevant biological information compared with a classical filter gene selection based on the t-test.

Submitted: June 19, 2007 · Accepted: October 1, 2007 · Published: November 6, 2007

Recommended Citation

Lê Cao, Kim-Anh; Gonçalves, Olivier; Besse, Philippe; and Gadat, Sébastien (2007) "Selection of Biologically Relevant Genes with a Wrapper Stochastic Algorithm," Statistical Applications in Genetics and Molecular Biology: Vol. 6 : Iss. 1, Article 29.
Available at: http://www.bepress.com/sagmb/vol6/iss1/art29

ISSN: 1544-6115 ©1999-2008 The Berkeley Electronic Press™ All rights reserved.
