Validation and Discovery in Markov Models of Genetics Data

Victor De Gruttola, Harvard School of Public Health
Andrea S. Foulkes, University of Massachusetts

Abstract

Markov models provide a natural framework for modeling cellular- and molecular-level changes over time. Kalbfleisch and Lawless propose a chi-squared statistic for assessing whether a first-order, homogeneous Markov process is an appropriate assumption. While this statistic provides a global test of the Markov assumption, it does not permit identification of individual departures. We consider two approaches for discovering specific departures from the Markov assumption. First, we propose a diagnostic that tests whether the number of observed transitions out of a given state at a given time point differs from the number expected under the model. Second, we construct statistics based on the number of observations in each state at each time point. In both cases, we construct multiple correlated statistics, and testing is carried out through simulation. These approaches are applied to HIV genetic sequences measured over time.
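
To make the second approach concrete, the following is a minimal Python sketch of a simulation-based test built on state counts at each time point. It is an illustration under stated assumptions, not the authors' procedure: the transition matrix `P` is treated as known rather than estimated, the cell statistics are Pearson-type, the multiple correlated statistics are combined by taking their maximum, and all function names are hypothetical.

```python
import random

def simulate_chain(P, start, n_steps, rng):
    """Simulate one path of a first-order, homogeneous Markov chain."""
    path = [start]
    for _ in range(n_steps):
        row = P[path[-1]]
        path.append(rng.choices(range(len(row)), weights=row)[0])
    return path

def state_counts(paths, n_states):
    """counts[t][s] = number of paths observed in state s at time t."""
    counts = [[0] * n_states for _ in range(len(paths[0]))]
    for path in paths:
        for t, s in enumerate(path):
            counts[t][s] += 1
    return counts

def expected_counts(P, start_counts, n_times):
    """Expected counts given the starting states: e_{t+1} = e_t P."""
    n = len(P)
    exp = [[float(c) for c in start_counts]]
    for _ in range(n_times - 1):
        prev = exp[-1]
        exp.append([sum(prev[i] * P[i][j] for i in range(n)) for j in range(n)])
    return exp

def max_statistic(counts, exp):
    """Largest Pearson-type cell statistic over times t >= 1 and all states."""
    return max(
        (counts[t][s] - exp[t][s]) ** 2 / exp[t][s]
        for t in range(1, len(counts))
        for s in range(len(counts[t]))
        if exp[t][s] > 0
    )

def simulated_p_value(observed_paths, P, n_sims=500, seed=0):
    """Compare the observed maximum statistic with its simulated null distribution.

    Assumption: P is known. With an estimated P, the estimation step would
    need to be repeated within each simulated data set.
    """
    rng = random.Random(seed)
    n_states = len(P)
    n_times = len(observed_paths[0])
    starts = [path[0] for path in observed_paths]
    obs = state_counts(observed_paths, n_states)
    exp = expected_counts(P, obs[0], n_times)
    t_obs = max_statistic(obs, exp)
    exceed = 0
    for _ in range(n_sims):
        sim = [simulate_chain(P, s, n_times - 1, rng) for s in starts]
        if max_statistic(state_counts(sim, n_states), exp) >= t_obs:
            exceed += 1
    # Add-one correction keeps the simulated p-value strictly positive.
    return t_obs, (exceed + 1) / (n_sims + 1)
```

Because the maximum is taken over all time points and states, the simulated reference distribution accounts for the correlation among the individual cell statistics, and the cell attaining the maximum points to where the data depart most from the model.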

Submitted: October 20, 2004 · Accepted: December 11, 2004 · Published: December 27, 2004

Recommended Citation

De Gruttola, Victor and Foulkes, Andrea S. (2004) "Validation and Discovery in Markov Models of Genetics Data," Statistical Applications in Genetics and Molecular Biology: Vol. 3 : Iss. 1, Article 38.
DOI: 10.2202/1544-6115.1104
Available at: http://www.bepress.com/sagmb/vol3/iss1/art38

ISSN: 1544-6115 ©1999-2009 The Berkeley Electronic Press™ All rights reserved.
