Deletion/Substitution/Addition Algorithm in Learning with Applications in Genomics

Sandra E. Sinisi, Division of Biostatistics, School of Public Health, University of California, Berkeley
Mark J. van der Laan, Division of Biostatistics, School of Public Health, University of California, Berkeley

Abstract

van der Laan and Dudoit (2003) provide a road map for estimation and performance assessment in which the parameter of interest is defined as the risk minimizer for a suitable loss function, and candidate estimators are generated using a loss function. After briefly reviewing this approach, this article proposes a general deletion/substitution/addition algorithm for minimizing, over subsets of variables (e.g., basis functions), the empirical risk of subset-specific estimators of the parameter of interest. This algorithm yields a new class of loss-based cross-validated algorithms for the prediction of univariate outcomes, which can be extended to handle multivariate outcomes, conditional density and hazard estimation, and censored outcomes such as survival times. In the context of regression with polynomial basis functions, we study the properties of the deletion/substitution/addition algorithm in simulations and apply the method to detect transcription factor binding sites in yeast gene expression experiments.

Submitted: June 6, 2004 · Accepted: July 29, 2004 · Published: August 12, 2004

Recommended Citation

Sinisi, Sandra E. and van der Laan, Mark J. (2004) "Deletion/Substitution/Addition Algorithm in Learning with Applications in Genomics," Statistical Applications in Genetics and Molecular Biology: Vol. 3 : Iss. 1, Article 18.
Available at: http://www.bepress.com/sagmb/vol3/iss1/art18

ISSN: 1544-6115 ©1999-2008 The Berkeley Electronic Press™ All rights reserved.
