A Compendium to Ensure Computational Reproducibility in High-Dimensional Classification Tasks

Markus Ruschhaupt, Division of Molecular Genome Analysis, German Cancer Research Centre
Wolfgang Huber, German Cancer Research Center, Heidelberg, Germany
Annemarie Poustka, Division of Molecular Genome Analysis, German Cancer Research Centre
Ulrich Mansmann, Department for Medical Biometrics/Informatics, University of Heidelberg

Abstract

We demonstrate the concept and an implementation of a compendium for the classification of high-dimensional data from microarray gene expression profiles. A compendium is an interactive document that bundles primary data, statistical processing methods, figures, and derived data together with the textual documentation and conclusions. Interactivity allows the reader to modify and extend these components. We address the following questions: How much does the discriminatory power of a classifier depend on the choice of the algorithm used to identify it? What alternative classifiers could be used just as well? How robust is the result? The answers to these questions are essential prerequisites for the validation and biological interpretation of the classifiers. We illustrate the approach by examining these questions for a specific breast cancer microarray data set that was first studied by Huang et al. (2003).

Submitted: July 20, 2004 · Accepted: November 30, 2004 · Published: December 19, 2004

Recommended Citation

Ruschhaupt, Markus; Huber, Wolfgang; Poustka, Annemarie; and Mansmann, Ulrich (2004) "A Compendium to Ensure Computational Reproducibility in High-Dimensional Classification Tasks," Statistical Applications in Genetics and Molecular Biology: Vol. 3 : Iss. 1, Article 37.
Available at: http://www.bepress.com/sagmb/vol3/iss1/art37

Related Files

compHuang_1.0.4.tar.gz (146238 kB)
The compendium - source tarball (for Unix and Mac)

compHuang_1.0.4.zip (146268 kB)
The compendium - Windows version

MCRestimate_1.0.9.tar.gz (82 kB)
The R package MCRestimate - source tarball (for Unix and Mac)

MCRestimate_1.0.9.zip (132 kB)
The R package MCRestimate - Windows version

ISSN: 1544-6115 ©1999-2008 The Berkeley Electronic Press™ All rights reserved.
