Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods

Journal of Molecular Evolution

Abstract

Two approximate methods are proposed for maximum likelihood phylogenetic estimation that allow rates of substitution to vary across nucleotide sites. Three data sets with quite different characteristics were analyzed to examine the performance of these methods empirically. The first method, called the “discrete gamma model,” uses several categories of rates, each with equal probability, to approximate the gamma distribution. The mean of each category represents all the rates falling in that category. This method performs well: four such categories appear sufficient to produce both an optimum or near-optimum fit of the model to the data and an acceptable approximation to the continuous distribution. The second method, called the “fixed-rates model,” classifies sites into several classes according to their rates as predicted under the star tree. Sites in different classes are then assumed to evolve at these fixed rates when other tree topologies are evaluated. Analyses of the data sets suggest that this method can produce reasonable results, but it seems to share some properties of a least-squares pairwise comparison; for example, interior branch lengths in nonbest trees are often estimated to be zero. The computational requirements of the two methods are comparable to those of Felsenstein's (1981, J Mol Evol 17:368–376) model, which assumes a single rate for all sites.
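
The discrete gamma construction described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's own code: the function names (`reg_lower_gamma`, `gamma_quantile`, `discrete_gamma_rates`) are hypothetical, and the sketch assumes a gamma distribution with mean 1 (shape and rate both equal to `alpha`), which the abstract does not state. The category boundaries are the quantiles at i/K, and each category's mean rate is obtained from the regularized incomplete gamma function with shape `alpha + 1`.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, 100000):
        term *= x / (a + n)
        total += term
        if term < total * 1e-16:
            break
    log_prefactor = a * math.log(x) - x - math.lgamma(a)
    return min(1.0, total * math.exp(log_prefactor))


def gamma_quantile(p, shape, rate):
    """Quantile of Gamma(shape, rate) at probability p (0 < p < 1), by bisection."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(shape, hi * rate) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(shape, mid * rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k=4):
    """Mean rate of each of k equal-probability categories.

    Assumption (not from the abstract): the gamma distribution has mean 1,
    i.e. shape = rate = alpha.
    """
    # Category boundaries: 0, the quantiles at 1/k, ..., (k-1)/k, and infinity.
    cuts = [0.0] + [gamma_quantile(i / k, alpha, alpha) for i in range(1, k)]
    # P(alpha + 1, alpha * b) at each boundary b; it equals 1 at infinity.
    cum = [reg_lower_gamma(alpha + 1.0, alpha * b) for b in cuts] + [1.0]
    # Mean within a category = k * (difference of cum values at its boundaries).
    return [k * (cum[i + 1] - cum[i]) for i in range(k)]


if __name__ == "__main__":
    print([round(r, 4) for r in discrete_gamma_rates(0.5, 4)])
```

With `alpha = 0.5` and four categories, this gives rates of roughly 0.033, 0.252, 0.820, and 2.894, which average to 1 by construction. If SciPy is available, `scipy.special.gammainc` and `scipy.stats.gamma.ppf` could replace the two hand-written helpers.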

References

  • Best DJ, Roberts DE (1975) The percentage points of the χ² distribution. Appl Statist 24:385–388

  • Bhattacharjee GP (1970) The incomplete gamma integral. Appl Statist 19:285–287

  • Brown WM, Prager EM, Wang A, Wilson AC (1982) Mitochondrial DNA sequences of primates, tempo and mode of evolution. J Mol Evol 18:225–239

  • Cavalli-Sforza LL, Edwards AWF (1967) Phylogenetic analysis: models and estimation procedures. Evolution 21:550–570

  • Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17:368–376

  • Fitch WM (1986) The estimate of total nucleotide substitutions from pairwise differences is biased. Philos Trans R Soc Lond Biol 312:317–324

  • Fitch WM, Margoliash E (1967) A method for estimating the number of invariant amino acid coding positions in a gene, using cytochrome c as a model case. Biochem Genet 1:65–71

  • Fitch WM, Markowitz E (1970) An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem Genet 4:579–593

  • Goldman N (1993) Statistical tests of models of DNA substitution. J Mol Evol 36:182–198

  • Hasegawa M (1991) Molecular phylogeny and man's place in Hominoidea. J Anthrop Soc Nippon 99:49–61

  • Hasegawa M, Horai S (1991) Time of the deepest root for polymorphism in human mitochondrial DNA. J Mol Evol 32:37–42

  • Hasegawa M, Kishino H (1989) Confidence limits on the maximum likelihood estimation of the hominoid tree from mitochondrial DNA sequences. Evolution 43:672–677

  • Hasegawa M, Kishino H, Yano T (1985) Dating the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22:160–174

  • Hasegawa M, Di Rienzo A, Kocher TD, Wilson AC (1993) Toward a more accurate time scale for the human mitochondrial DNA tree. J Mol Evol 37:347–354

  • Holmquist R, Goodman M, Conry T, Czelusniak J (1983) The spatial distribution of fixed mutations within genes coding for proteins. J Mol Evol 19:437–448

  • Jin L, Nei M (1990) Limitations of the evolutionary parsimony method of phylogeny analysis. Mol Biol Evol 7:82–102

  • Kocher TD, Wilson AC (1991) Sequence evolution of mitochondrial DNA in humans and chimpanzees: Control region and a protein-coding region. In: Osawa S, Honjo T (eds) Evolution of life: fossils, molecules, and culture. Springer-Verlag, Tokyo, pp 391–413

  • Li W-H, Gouy M, Sharp PM, O'hUigin C, Yang Y-W (1990) Molecular phylogeny of Rodentia, Lagomorpha, Primates, Artiodactyla, and Carnivora and molecular clocks. Proc Natl Acad Sci USA 87:6703–6707

  • Navidi WC, Churchill GA, von Haeseler A (1991) Methods for inferring phylogenies from nucleic acid sequence data by using maximum likelihood and linear invariants. Mol Biol Evol 8:128–143

  • Nei M, Gojobori T (1986) Simple methods for estimating the number of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3:418–426

  • Palumbi SR (1989) Rates of molecular evolution and the function of nucleotide positions free to vary. J Mol Evol 29:180–187

  • Tamura K, Nei M (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 10:512–526

  • Thorne JL, Kishino H, Felsenstein J (1992) Inching toward reality: an improved likelihood model of sequence evolution. J Mol Evol 34:3–16

  • Uzzell T, Corbin KW (1971) Fitting discrete probability distributions to evolutionary events. Science 172:1089–1096

  • Wakeley J (1993) Substitution rate variation among sites in hypervariable region 1 of human mitochondrial DNA. J Mol Evol 37:613–623

  • Yang Z (1993) Maximum likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites. Mol Biol Evol 10:1396–1401

  • Yang Z (in press) Estimating the pattern of nucleotide substitution. J Mol Evol

  • Yang Z, Wang T (in press) Mixed model analysis of DNA sequence evolution. Biometrics

  • Yang Z, Goldman N, Friday AE (1994) Comparison of models for nucleotide substitution used in maximum likelihood phylogenetic estimation. Mol Biol Evol 11:316–324

Cite this article

Yang, Z. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods. J Mol Evol 39, 306–314 (1994). https://doi.org/10.1007/BF00160154

  • DOI: https://doi.org/10.1007/BF00160154
