A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics

Juliane Schäfer, Department of Statistics, University of Munich, Germany
Korbinian Strimmer, Department of Statistics, University of Munich, Germany

Abstract

Inferring large-scale covariance matrices from sparse genomic data is a ubiquitous problem in bioinformatics. Clearly, the widely used standard covariance and correlation estimators are ill-suited for this purpose. As a statistically efficient and computationally fast alternative, we propose a novel shrinkage covariance estimator that exploits the Ledoit-Wolf (2003) lemma for analytic calculation of the optimal shrinkage intensity.

Subsequently, we apply this improved covariance estimator (which has guaranteed minimum mean squared error, is well-conditioned, and is always positive definite even for small sample sizes) to the problem of inferring large-scale gene association networks. We show that it performs very favorably compared to competing approaches, both in simulations and in application to real expression data.

Submitted: August 12, 2005 · Accepted: October 25, 2005 · Published: November 14, 2005

Recommended Citation

Schäfer, Juliane and Strimmer, Korbinian (2005) "A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics," Statistical Applications in Genetics and Molecular Biology: Vol. 4 : Iss. 1, Article 32.
Available at: http://www.bepress.com/sagmb/vol4/iss1/art32

ISSN: 1544-6115 ©1999-2008 The Berkeley Electronic Press™ All rights reserved.
