On the Dependence Structure of Sequence Alignment Scores Calculated with Multiple Scoring Matrices

Florian Frommlet, Medical University of Vienna
Andreas Futschik, University of Vienna

Abstract

A common practice in protein sequence alignment is to try several scoring matrices until "something interesting" is found. This leads to a multiple testing problem that makes p- and E-values hard to interpret. We focus on local alignment and propose using logistic copula functions to model explicitly the dependence structure of scores obtained with different scoring matrices. This yields p-value correction factors for the case where more than one scoring matrix is applied to the same sequences. Furthermore, the parameter of the logistic copula can be interpreted as a measure of dependence, providing insight into the relatedness of the scores from different matrices.
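To make the idea of a copula-based p-value correction concrete, here is a minimal sketch for two scoring matrices. It assumes the standard bivariate logistic (Gumbel) copula with a dependence parameter r in (0, 1], where r = 1 means independence and smaller r means stronger dependence. This parametrization, and the function names logistic_copula and joint_p_value, are illustrative assumptions and are not taken from the abstract.

```python
import math


def logistic_copula(u, v, r):
    """Bivariate logistic (Gumbel) copula C(u, v).

    Assumed parametrization: r in (0, 1], with r = 1 giving
    independence (C(u, v) = u * v) and r -> 0 giving complete dependence.
    """
    if not 0.0 < r <= 1.0:
        raise ValueError("r must lie in (0, 1]")
    if not (0.0 < u < 1.0 and 0.0 < v < 1.0):
        raise ValueError("u and v must lie strictly between 0 and 1")
    s = (-math.log(u)) ** (1.0 / r) + (-math.log(v)) ** (1.0 / r)
    return math.exp(-(s ** r))


def joint_p_value(p1, p2, r):
    """Probability that at least one of two scores is as extreme as observed.

    p1 and p2 are the marginal p-values obtained with each scoring matrix;
    the dependence between the two scores is modeled by the logistic copula.
    """
    return 1.0 - logistic_copula(1.0 - p1, 1.0 - p2, r)


if __name__ == "__main__":
    # Hypothetical marginal p-values, for illustration only.
    for r in (1.0, 0.5, 0.1):
        print(r, joint_p_value(0.01, 0.01, r))
```

Under this assumed model, two equal marginal p-values p combine to 1 - (1 - p)^(2^r), so for small p the correction factor is roughly 2^r: close to 2 (a Bonferroni-like correction) when the scores are nearly independent and close to 1 when they are strongly dependent.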

Submitted: June 18, 2004 · Accepted: September 30, 2004 · Published: October 5, 2004

Recommended Citation

Frommlet, Florian and Futschik, Andreas (2004) "On the Dependence Structure of Sequence Alignment Scores Calculated with Multiple Scoring Matrices," Statistical Applications in Genetics and Molecular Biology: Vol. 3 : Iss. 1, Article 24.
Available at: http://www.bepress.com/sagmb/vol3/iss1/art24

ISSN: 1544-6115 ©1999-2008 The Berkeley Electronic Press™ All rights reserved.