Combining Nearest Neighbor Classifiers Versus Cross-Validation Selection

Minhui Paik, Iowa State University
Yuhong Yang, Iowa State University

Abstract

Various discriminant methods have been applied for classification of tumors based on gene expression profiles, among which the nearest neighbor (NN) method has been reported to perform relatively well. Usually cross-validation (CV) is used to select the neighbor size as well as the number of variables for the NN method. However, CV can perform poorly when there is considerable uncertainty in choosing the best candidate classifier. As an alternative to selecting a single "winner," we propose a weighting method to combine the multiple NN rules. Four gene expression data sets are used to compare its performance with CV methods. The results show that when the CV selection is unstable, the combined classifier performs much better.

Submitted: April 5, 2004 · Accepted: June 1, 2004 · Published: June 9, 2004

Recommended Citation

Paik, Minhui and Yang, Yuhong (2004) "Combining Nearest Neighbor Classifiers Versus Cross-Validation Selection," Statistical Applications in Genetics and Molecular Biology: Vol. 3 : Iss. 1, Article 12.
Available at: http://www.bepress.com/sagmb/vol3/iss1/art12


ISSN: 1544-6115 ©1999-2008 The Berkeley Electronic Press™ All rights reserved.
