Robust Remote Homology Detection by Feature Based Profile Hidden Markov Models
Abstract
The detection of remote homologies is of major importance for
molecular biology applications like drug discovery. The
problem is still very challenging even for state-of-the-art
probabilistic models of protein families, namely Profile HMMs. In
order to improve remote homology detection we propose feature based
semi-continuous Profile HMMs. Based on a richer sequence
representation consisting of features which capture the biochemical
properties of residues in their local context, family specific
semi-continuous models are estimated in a completely data-driven
manner. Additionally, to substantially reduce the number of
false predictions, an explicit rejection model is estimated. The
family specific semi-continuous Profile HMM and the non-target model
are then evaluated in competition with each other.
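As a rough illustration of the two ideas named above, the following minimal Python sketch shows (a) a semi-continuous emission density, in which every state mixes the same shared set of Gaussians with its own weights, and (b) a competitive decision between a family model and a rejection model. It is a sketch under stated assumptions, not the authors' implementation: the one-dimensional features, the function names, and the `margin` parameter are hypothetical, and the log-likelihoods are assumed to come from scoring a sequence against each model.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian (1-D features are an assumption of this sketch)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def semi_continuous_emission(x, state_weights, codebook):
    """Emission density of one state.

    The codebook of (mean, variance) pairs is shared by all states;
    only the mixture weights are specific to the state.
    """
    return sum(w * gaussian_pdf(x, m, v) for w, (m, v) in zip(state_weights, codebook))


def is_target(target_loglik, rejection_loglik, margin=0.0):
    """Accept a sequence only if the family model beats the rejection model.

    `margin` is a hypothetical tuning parameter, not taken from the paper.
    """
    return (target_loglik - rejection_loglik) > margin


# Hypothetical shared codebook with two Gaussians and two states' weights.
codebook = [(0.0, 1.0), (3.0, 1.0)]
weights_state_a = [0.9, 0.1]
weights_state_b = [0.2, 0.8]
density_a = semi_continuous_emission(0.5, weights_state_a, codebook)
density_b = semi_continuous_emission(0.5, weights_state_b, codebook)
```

Sharing the codebook keeps the number of parameters per state small, which matters when only a few sequences of a family are available for training.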
In the experimental evaluation of superfamily based screening of the
SCOP database we demonstrate that semi-continuous Profile HMMs
significantly outperform their discrete counterparts. Using the
rejection model, the number of false positive predictions was
reduced substantially, which is an important prerequisite for
target identification applications.
Submitted: May 18, 2005 · Accepted: August 24, 2005 · Published: September 6, 2005
Recommended Citation
Plötz, Thomas and Fink, Gernot A. (2005) "Robust Remote Homology Detection by Feature Based Profile Hidden Markov Models," Statistical Applications in Genetics and Molecular Biology: Vol. 4: Iss. 1, Article 21.
Available at: http://www.bepress.com/sagmb/vol4/iss1/art21
