Opinion
Three roads diverged? Routes to phylogeographic inference

https://doi.org/10.1016/j.tree.2010.08.010

Phylogeographic methods facilitate inference of the geographical history of genetic lineages. Recent examples explore human migration and the origins of viral pandemics. There is longstanding disagreement over the use and validity of certain phylogeographic inference methodologies. In this paper, we highlight three distinct frameworks for phylogeographic inference to give a taste of this disagreement. Each of the three approaches presents a different viewpoint on phylogeography, most fundamentally on how we view the relationship between the inferred history of a sample and the history of the population the sample is embedded in. Satisfactory resolution of this relationship between history of the tree and history of the population remains a challenge for all but the most trivial models of phylogeographic processes. Intriguingly, we believe that some recent methods that entirely avoid inference about the history of the population will eventually help to reach a resolution.

Emerging pathways of phylogeographic inference

The influence of phylogeography is spreading throughout biology. Among other examples, phylogeographic techniques have enabled us to infer the origins of mice [1], modern humans [2, 3], and man's “best friend”, the domesticated dog [4]. Phylogeographic analyses also enable public health officials to understand the origin and spread of emerging infectious diseases [5–8]. In spite of these successes, disagreement and confusion persist regarding the most effective ways to learn about

Comparative approach

For almost a decade, supporters of NCPA [14] and of model-based phylogeographic methods [20] have argued over the merits of each method. Similar to previous debate in phylogenetics, this has generated several positive outcomes; most importantly, investigators now vigorously question assumptions when analyzing geographic, demographic and evolutionary data [21]. Nevertheless, this long-standing discussion has introduced considerable confusion [22, 23]. However, the debate now seems to be coming to

Spatial diffusion approach

As an alternative to the comparative method and its Bayesian extensions, we now highlight recent developments towards model-based approaches that take a probabilistic perspective on spatial diffusion [15, 16]. Although much of this work was implemented as part of a comprehensive statistical inference package that led to fruitful advances in demographic models [42, 43], we want to emphasize that, unlike spatial coalescent approaches [44], phylogenetic diffusion models do not infer population-based

Population genetics approach

By far the most popular statistical approaches to phylogeography rely on the structured-coalescent framework [9, 44, 61, 62]. In general, these methods assume that evolutionary trees are random draws from some underlying population-level process [9]. Essentially, population-level processes fossilize their histories as evolutionary trees that we indirectly view through molecular sequence and other data [10, 61]. These processes include selection, migration, population size changes and recombination.
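
To make the idea of trees as "random draws from some underlying population-level process" concrete, the following minimal sketch simulates waiting times between coalescent events. It is an illustration only, not the method of any package cited here: it assumes a single, constant-size, panmictic population (the simplest special case, without the migration between demes that the structured coalescent adds), and the function name, the parameter names and the time scaling in units of `pop_size` generations are assumptions made for this example.

```python
import random


def simulate_coalescent_times(n_samples, pop_size, rng):
    """Draw waiting times between successive coalescent events.

    Assumption for this sketch: one constant-size, randomly mating
    population. While k lineages remain, the waiting time to the next
    coalescence is exponential with rate k*(k-1)/2 when time is measured
    in units of pop_size generations.
    """
    waits = []
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2.0
        # expovariate takes a rate; multiply to rescale into generations.
        waits.append(rng.expovariate(rate) * pop_size)
    return waits


rng = random.Random(1)
waits = simulate_coalescent_times(n_samples=5, pop_size=1000, rng=rng)
tmrca = sum(waits)  # time to the most recent common ancestor of the sample
print(f"{len(waits)} coalescent events, TMRCA = {tmrca:.1f} generations")
```

Each call yields a different set of waiting times, which is the sense in which a genealogy is one realization of the population process: changing `pop_size` changes the distribution of trees, and inference runs this logic in reverse.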

Three routes to the same destination

With the expansion of phylogeography in new Bayesian directions involving NCPA and spatial diffusion, the field might seem to be fragmenting. However, we believe all three approaches address the same basic question posed at the beginning of this article: what are the most effective ways to learn about phylogeographic processes from geospatially identified molecular sequence data? Essentially, we believe that all three frameworks produce effective answers, if we know what questions to ask. Under

Acknowledgments

We thank Chris Simon, Allen Rodrigo, Brian C. Carstens, H. Lisle Gibbs, Laura S. Kubatko, Peter Beerli and Ioanna Manolopoulou for their comments and suggestions. We also thank the National Evolutionary Synthesis Center (NSF #EF-0423641) for fostering our collaboration. Alexei J. Drummond contributed greatly to discussions on an earlier version of this paper. EWB is supported in part by the National Science Foundation under Agreement No. 0635561. PL is supported by a postdoctoral fellowship

Glossary

Ancestral history
any information about the direct ancestors of a sample of molecular sequences. This term can refer, for example, to inferred sequence composition or phenotype, such as geography, and is often associated with a time scale.
Approximate Bayesian computation (ABC)
simulation technique used to draw statistical inference based on data summaries, often used when computation of the full data likelihood is impractical.
Bayes factor
ratio of the marginal likelihoods of a given data set
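
The ABC entry above can be illustrated with a rejection sampler, the simplest member of that family. This is a hedged sketch on a toy problem rather than a phylogeographic analysis: the data model (50 exponential waiting times with unknown mean `theta`), the uniform prior, the choice of the sample mean as the data summary, the tolerance, and all function names are assumptions introduced for the example.

```python
import random


def abc_rejection(observed_summary, prior_draw, simulate, summarize,
                  tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lands near the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)            # 1. propose a parameter from the prior
        simulated = simulate(theta, rng)   # 2. simulate data under that parameter
        # 3. compare summaries instead of evaluating the full likelihood
        if abs(summarize(simulated) - observed_summary) <= tolerance:
            accepted.append(theta)
    return accepted


def prior_draw(rng):
    # Hypothetical prior: theta uniform on [0.1, 5.0].
    return rng.uniform(0.1, 5.0)


def simulate(theta, rng):
    # Hypothetical data model: 50 exponential waiting times with mean theta.
    return [rng.expovariate(1.0 / theta) for _ in range(50)]


def summarize(data):
    # Data summary used for the comparison: the sample mean.
    return sum(data) / len(data)


rng = random.Random(7)
posterior_sample = abc_rejection(
    observed_summary=2.0, prior_draw=prior_draw, simulate=simulate,
    summarize=summarize, tolerance=0.1, n_draws=5000, rng=rng,
)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
print(f"accepted {len(posterior_sample)} draws, posterior mean {posterior_mean:.2f}")
```

The accepted values approximate the posterior distribution of `theta` given the summary. A smaller `tolerance` tightens the approximation at the cost of fewer accepted draws, which is the basic trade-off in ABC.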

References (86)

  • J. Searle

    Of mice and (Viking?) men: phylogeography of British and Irish house mice

    Proc. R. Soc. B Biol. Sci.

    (2009)
  • N. Fagundes

    Statistical evaluation of alternative models of human evolution

    Proc. Natl. Acad. Sci. U. S. A.

    (2007)
  • J. Li

    Worldwide human relationships inferred from genome-wide patterns of variation

    Science

    (2008)
  • B. von Holdt

    Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication

    Nature

    (2010)
  • R. Biek

    A virus reveals population structure and recent demographic history of its carnivore host

    Science

    (2006)
  • R. Biek

    A high-resolution genetic signature of demographic and spatial expansion in epizootic rabies virus

    Proc. Natl. Acad. Sci. U. S. A.

    (2007)
  • G. Smith

    Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic

    Nature

    (2009)
  • P. Lemey

    Reconstructing the initial global spread of a human influenza pandemic: a Bayesian spatial-temporal model for the global spread of H1N1

    PLoS Curr. Influenza

    (2009)
  • R. Nielsen et al.

    Statistical inferences in phylogeography

    Mol. Ecol.

    (2009)
  • L. Knowles

    Statistical phylogeography

    Annu. Rev. Ecol. Evol. Syst.

    (2009)
  • J. Avise

    Phylogeography: retrospect and prospect

    J. Biogeogr.

    (2009)
  • M. Hickerson

    Phylogeography's past, present, and future: 10 years after Avise, 2000

    Mol. Phylogenet. Evol.

    (2010)
  • S. Brooks

    Assessing the effect of genetic mutation: a Bayesian framework for determining population history from DNA sequence data

  • A. Templeton

    Coalescent-based, maximum likelihood inference in phylogeography

    Mol. Ecol.

    (2010)
  • P. Lemey

    Bayesian phylogeography finds its roots

    PLoS Comput. Biol.

    (2009)
  • P. Lemey

    Phylogeography takes a relaxed random walk in continuous space and time

    Mol. Biol. Evol.

    (2010)
  • J. Hey

    Isolation with migration models for more than two populations

    Mol. Biol. Evol.

    (2010)
  • J. Heled et al.

    Bayesian inference of species trees from multilocus data

    Mol. Biol. Evol.

    (2010)
  • P. Beerli et al.

    Unified framework to evaluate panmixia and migration direction among multiple sampling locations

    Genetics

    (2010)
  • M. Beaumont

    In defence of model-based inference in phylogeography

    Mol. Ecol.

    (2010)
  • A. Camargo

    Phylogeography of the frog Leptodactylus validus (Amphibia: Anura): patterns and timing of colonization events in the Lesser Antilles

    Mol. Phylogenet. Evol.

    (2009)
  • I. Phillipsen et al.

    Phylogeography of a stream-dwelling frog (Pseudacris cadaverina) in southern California

    Mol. Phylogenet. Evol.

    (2009)
  • H. Gante

    Diversification within glacial refugia: tempo and mode of evolution of the polytypic fish Barbus sclateri

    Mol. Ecol.

    (2009)
  • J. Felsenstein

    Phylogenies and the comparative method

    Am. Nat.

    (1985)
  • P. Harvey et al.

    The Comparative Method in Evolutionary Biology

    (1991)
  • A. Templeton

    Nested clade analyses of phylogeographic data: testing hypotheses about gene flow and population history

    Mol. Ecol.

    (1998)
  • M. Panchal et al.

    The automation and evaluation of nested clade phylogeographic analysis

    Evolution

    (2007)
  • A. Templeton

    A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. III. Cladogram estimation

    Genetics

    (1992)
  • M. Clement

    TCS: a computer program to estimate gene genealogies

    Mol. Ecol.

    (2000)
  • D. Posada et al.

    Intraspecific gene genealogies: trees grafting into networks

    Trends Ecol. Evol.

    (2001)
  • A. Templeton

    A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. I. Basic theory and an analysis of alcohol dehydrogenase activity in Drosophila

    Genetics

    (1987)
  • A. Templeton

    A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. IV. Nested analyses with cladogram uncertainty and recombination

    Genetics

    (1993)
  • I. Cassens

    Evaluating intraspecific “network” construction methods using simulated sequence data: do existing algorithms outperform the global maximum parsimony approach?

    Syst. Biol.

    (2005)
  • L. Knowles

    Why does a method that fails continue to be used?

    Evolution

    (2008)
  • A. Templeton

    A maximum likelihood framework for cross validation of phylogeographic hypotheses

  • M. Panchal et al.

    Evaluating nested clade phylogenetic analysis under models of restricted gene flow

    Syst. Biol.

    (2010)
  • K. Wong

    Alignment uncertainty and genomic analysis

    Science

    (2008)
  • B. Redelings et al.

    Robust inferences from ambiguous alignments

  • M. Suchard

    Bayesian selection of continuous-time Markov chain evolutionary models

    Mol. Biol. Evol.

    (2001)
  • B. Redelings et al.

    Joint Bayesian estimation of alignment and phylogeny

    Syst. Biol.

    (2005)
  • A. Novák

    StatAlign: an extendable software package for joint Bayesian estimation of alignments and evolutionary trees

    Bioinformatics

    (2008)
  • A. Drummond

    Bayesian coalescent inference of past population dynamics from molecular sequences

    Mol. Biol. Evol.

    (2005)
  • V. Minin

    Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics

    Mol. Biol. Evol.

    (2008)

Cited by (88)

    • Emergence of an early SARS-CoV-2 epidemic in the United States

      2021, Cell
      Citation Excerpt :

      Prior to Mardi Gras, our analyses demonstrated that Texas is more than twice as likely as the next most probable state to be the source of SARS-CoV-2 lineages in New Orleans, while SARS-CoV-2 in Shreveport likely originated from New Orleans itself (Figures 4A and 4B). Although these analyses point to Texas as a likely source of the Louisiana clade, our phylogeographic inference is limited by geographic and temporal sampling (Bloomquist et al., 2010). Therefore, we also investigated movement between New Orleans, Shreveport, and other U.S. states by analyzing human mobility patterns.

    • Unequal sisters – Past and potential future range development of Anatolian and Hyrcanian brown frogs

      2021, Zoology
      Citation Excerpt :

      We also changed the a priori distribution of the parameter ucld.mean into a uniform prior ranging from 0 to 1. In Beast, the phylogeographic history is reconstructed by the inference of the geographic history of genetic lineages (Bloomquist et al., 2010), so we linked each sequence to a geographic coordinate (Table S1). With the RRW option in Beast, dispersal rates are allowed to vary across branches, so we used this model to perform a RRW phylogeographic analysis to reconstruct the ancestral ranges of the Anatolian and Hyrcanian brown frog lineages.

    • HIV Rebound Is Predominantly Fueled by Genetically Identical Viral Expansions from Diverse Reservoirs

      2019, Cell Host and Microbe
      Citation Excerpt :

      Visualizing genetic similarity using haplotype networks further confirmed the diverse origins of the rebound viruses (Figure S2). To further unravel to what extent specific compartments can act as the source of rebound viruses, we employed a phylogeographic approach that is commonly used in molecular epidemiological research (Bloomquist et al., 2010; Faria et al., 2011) and quantified the rebound virus emergence events from each cell subset. The estimated number of rebound virus emergence events for 3 participants are summarized in a radar plot in Figure 3D and across participants in Figure S4.

    • Time-scaled phylogeography of complete Zika virus genomes using discrete and continuous space diffusion models

      2019, Infection, Genetics and Evolution
      Citation Excerpt :

      These models allow a more realistic reconstruction of spatial movements because, unlike discrete models, they do not necessarily need the ancestral location to be represented in the sampling location set (Lemey et al., 2010). The differences between discrete and continuous phylogeographic models have been efficiently described in previous reviews (Bloomquist et al., 2010; Faria et al., 2017). The aim of this study was to infer the origin and dispersion routes of ZIKV in the world using a classical discrete method and to reconstruct the recent epidemic in the Americas using a continuous phylogeographical method to better describe the local spread of ZIKV and to make hypothesis about the eco/epidemiology of the virus.
