A Cross-Validation Study to Select a Classification Procedure for Clinical Diagnosis Based on Proteomic Mass Spectrometry

Dirk Valkenborg, Hasselt University, Center for Statistics
Suzy Van Sanden, Hasselt University, Center for Statistics
Dan Lin, Hasselt University, Center for Statistics
Adetayo Kasim, Hasselt University, Center for Statistics
Qi Zhu, Hasselt University, Center for Statistics
Philippe Haldermans, Hasselt University, Center for Statistics
Ivy Jansen, Hasselt University, Center for Statistics
Ziv Shkedy, Hasselt University, Center for Statistics
Tomasz Burzykowski, Hasselt University, Center for Statistics

Abstract

We present an approach to construct a classification rule based on the mass spectrometry data provided by the organizers of the "Classification Competition on Clinical Mass Spectrometry Proteomic Diagnosis Data." Before constructing a classification rule, we attempted to pre-process the data and to select features of the spectra that were likely due to true biological signals (i.e., peptides/proteins). As a result, we selected a set of 92 features. To construct the classification rule, we considered eight methods for selecting a subset of the features, combined with seven classification methods. The performance of the resulting 56 combinations was evaluated by using a cross-validation procedure with 1000 re-sampled data sets. The best result, as indicated by the lowest overall misclassification rate, was obtained by using the whole set of 92 features as the input for a support-vector machine (SVM) with a linear kernel. This method was therefore used to construct the classification rule. For the training data set, the total error rate for the classification rule, as estimated by using leave-one-out cross-validation, was equal to 0.16, with the sensitivity and specificity equal to 0.87 and 0.82, respectively.
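The comparison described in the abstract can be illustrated with a minimal sketch. It assumes repeated stratified random splits as the re-sampling scheme and performs feature selection on the training part of each split only; the abstract does not state either detail. The selectors (`select_all`, `select_top_gap`), the nearest-centroid classifier, and the simulated data are hypothetical stand-ins, not the eight selection methods, seven classifiers, or the SVM used in the study.

```python
import random
from itertools import product
from statistics import mean


def error_rates(y_true, y_pred):
    """Return (total error, sensitivity, specificity); label 1 = positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    n_pos = sum(t == 1 for t in y_true)
    n_neg = len(y_true) - n_pos
    return 1 - (tp + tn) / len(y_true), tp / n_pos, tn / n_neg


def select_all(train_X, train_y):
    """Hypothetical selector: keep every feature."""
    return list(range(len(train_X[0])))


def select_top_gap(k):
    """Hypothetical selector: keep the k features with the largest class-mean gap."""
    def select(train_X, train_y):
        def gap(j):
            pos = [x[j] for x, y in zip(train_X, train_y) if y == 1]
            neg = [x[j] for x, y in zip(train_X, train_y) if y == 0]
            return abs(mean(pos) - mean(neg))
        return sorted(range(len(train_X[0])), key=gap, reverse=True)[:k]
    return select


def fit_nearest_centroid(train_X, train_y):
    """Hypothetical classifier (a stand-in, not an SVM): nearest class centroid."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(x):
        return min(centroids, key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(x, centroids[c])))
    return predict


def stratified_split(y, test_fraction, rng):
    """Random split that keeps at least one sample of each class in the test part."""
    train, test = [], []
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        n_test = max(1, int(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


def compare(X, y, selectors, classifiers, n_resamples=1000,
            test_fraction=0.25, seed=0):
    """Mean misclassification rate of every selector/classifier combination."""
    rng = random.Random(seed)
    errors = {combo: [] for combo in product(selectors, classifiers)}
    for _ in range(n_resamples):
        train, test = stratified_split(y, test_fraction, rng)
        train_X, train_y = [X[i] for i in train], [y[i] for i in train]
        for sel_name, clf_name in errors:
            # Select features on the training part only (an assumption).
            keep = selectors[sel_name](train_X, train_y)
            predict = classifiers[clf_name](
                [[x[j] for j in keep] for x in train_X], train_y)
            y_pred = [predict([X[i][j] for j in keep]) for i in test]
            err, _, _ = error_rates([y[i] for i in test], y_pred)
            errors[(sel_name, clf_name)].append(err)
    return {combo: mean(v) for combo, v in errors.items()}


# Simulated data: 40 samples, 6 features, only the first two carry signal.
data_rng = random.Random(1)
y = [0] * 20 + [1] * 20
X = [[data_rng.gauss(label * (j < 2), 1.0) for j in range(6)] for label in y]

selectors = {"all": select_all, "top2": select_top_gap(2)}
classifiers = {"centroid": fit_nearest_centroid}
results = compare(X, y, selectors, classifiers, n_resamples=50)
best = min(results, key=results.get)  # combination with the lowest mean error
```

With eight entries in `selectors` and seven in `classifiers`, `results` would hold the 56 combinations mentioned above, and `best` would be the combination carried forward to build the final rule.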

Submitted: February 11, 2008 · Accepted: February 20, 2008 · Published: March 24, 2008

Recommended Citation

Valkenborg, Dirk; Van Sanden, Suzy; Lin, Dan; Kasim, Adetayo; Zhu, Qi; Haldermans, Philippe; Jansen, Ivy; Shkedy, Ziv; and Burzykowski, Tomasz (2008) "A Cross-Validation Study to Select a Classification Procedure for Clinical Diagnosis Based on Proteomic Mass Spectrometry," Statistical Applications in Genetics and Molecular Biology: Vol. 7 : Iss. 2, Article 12.
Available at: http://www.bepress.com/sagmb/vol7/iss2/art12


ISSN: 1544-6115 ©1999-2008 The Berkeley Electronic Press™ All rights reserved.
