Sparse Logistic Regression with Lp Penalty for Biomarker Identification

Zhenqiu Liu, University of Maryland
Feng Jiang, University of Maryland
Guoliang Tian, University of Maryland
Suna Wang, University of Maryland School of Medicine
Fumiaki Sato, Johns Hopkins University School of Medicine
Stephen J. Meltzer, Johns Hopkins University School of Medicine
Ming Tan, University of Maryland Greenebaum Cancer Center

Abstract

In this paper, we propose a novel method for sparse logistic regression with the non-convex Lp (p < 1) regularization. Based on a smooth approximation of the penalty, we develop several fast algorithms for learning classifiers that are applicable to high-dimensional datasets such as gene expression data. To the best of our knowledge, these are the first algorithms to perform sparse logistic regression with an Lp and elastic net (Le) penalty. The regularization parameters are chosen by maximizing the area under the ROC curve (AUC) on the test data. Experimental results on methylation and microarray data attest to the accuracy, sparsity, and efficiency of the proposed algorithms. Biomarkers identified with our methods are compared with those reported in the literature. Our computational results show that Lp logistic regression (p < 1) outperforms L1 logistic regression and SCAD SVM. Software is available upon request from the first author.
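
As a rough illustration of the idea summarized above, the following Python sketch fits a logistic regression with a smoothed Lp penalty and picks the regularization weight by held-out AUC. It is not the authors' algorithm: the surrogate (w_j^2 + eps)^(p/2) for |w_j|^p, the plain gradient-descent solver, the synthetic data, and every name and default value (fit, lp_penalty, lam, eps, lr) are assumptions made for this example only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lp_penalty(w, p, eps):
    # Assumed smooth surrogate for sum_j |w_j|^p (differentiable at w_j = 0).
    return np.sum((w ** 2 + eps) ** (p / 2.0))

def objective(w, b, X, y, lam, p, eps):
    z = X @ w + b
    # Mean negative log-likelihood for labels in {0, 1}.
    nll = np.mean(np.logaddexp(0.0, z) - y * z)
    return nll + lam * lp_penalty(w, p, eps)

def fit(X, y, lam=0.05, p=0.5, eps=1e-2, lr=0.1, n_iter=2000):
    # Plain gradient descent on the smoothed, non-convex objective,
    # so this only finds a local minimum.
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        r = sigmoid(X @ w + b) - y
        # d/dw (w^2 + eps)^(p/2) = p * w * (w^2 + eps)^(p/2 - 1)
        grad_w = X.T @ r / n + lam * p * w * (w ** 2 + eps) ** (p / 2.0 - 1.0)
        w -= lr * grad_w
        b -= lr * r.mean()
    return w, b

def auc(scores, y):
    # Probability that a random positive outranks a random negative
    # (ties count one half).
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

# Synthetic stand-in for expression data: 3 informative features out of 20.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -2.0, 1.5]
y = (rng.random(300) < sigmoid(X @ w_true)).astype(float)
X_tr, y_tr, X_te, y_te = X[:200], y[:200], X[200:], y[200:]

# Choose lam by held-out AUC over a small hypothetical grid.
results = {}
for lam in (0.01, 0.05, 0.1):
    w_hat, b_hat = fit(X_tr, y_tr, lam=lam)
    results[lam] = auc(X_te @ w_hat + b_hat, y_te)
best_lam = max(results, key=results.get)
```

Because the surrogate is smooth, coefficients of uninformative features shrink toward zero rather than becoming exactly zero; a small threshold on |w_j| would be one way to read off a selected feature set in a sketch like this.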

Submitted: August 18, 2006 · Accepted: January 12, 2007 · Published: February 10, 2007

Recommended Citation

Liu, Zhenqiu; Jiang, Feng; Tian, Guoliang; Wang, Suna; Sato, Fumiaki; Meltzer, Stephen J.; and Tan, Ming (2007) "Sparse Logistic Regression with Lp Penalty for Biomarker Identification," Statistical Applications in Genetics and Molecular Biology: Vol. 6 : Iss. 1, Article 6.
Available at: http://www.bepress.com/sagmb/vol6/iss1/art6

ISSN: 1544-6115 ©1999-2008 The Berkeley Electronic Press™ All rights reserved.