Importance Sampling for the Infinite Sites Model

Asger Hobolth, Aarhus University
Marcy K. Uyenoyama, Duke University
Carsten Wiuf, Aarhus University

Abstract

Importance sampling or Markov chain Monte Carlo sampling is required for state-of-the-art statistical analysis of population genetics data. The applicability of these sampling-based inference techniques depends crucially on the proposal distribution. In this paper, we discuss importance sampling for the infinite sites model. The infinite sites assumption is attractive because it constrains the number of possible genealogies, thereby allowing for the analysis of larger data sets. We recall the Griffiths-Tavaré and Stephens-Donnelly proposals and emphasize the relation between the latter proposal and exact sampling from the infinite alleles model. We also introduce a new proposal that takes knowledge of the ancestral state into account. The new proposal is derived from a new result on exact sampling from a single site. The methods are illustrated on simulated data sets and on the data considered in Griffiths and Tavaré (1994).

Submitted: July 25, 2008 · Accepted: September 4, 2008 · Published: October 30, 2008

Recommended Citation

Hobolth, Asger; Uyenoyama, Marcy K.; and Wiuf, Carsten (2008) "Importance Sampling for the Infinite Sites Model," Statistical Applications in Genetics and Molecular Biology: Vol. 7 : Iss. 1, Article 32.
DOI: 10.2202/1544-6115.1400
Available at: http://www.bepress.com/sagmb/vol7/iss1/art32


ISSN: 1544-6115 ©1999-2009 The Berkeley Electronic Press™ All rights reserved.
