DOE Human Genome Program Contractor-Grantee Workshop IV

Santa Fe, New Mexico, November 13-17, 1994

The electronic form of this document may be cited in the following style:
Human Genome Program, U.S. Department of Energy, DOE Human Genome Program Contractor-Grantee Workshop IV, 1994.

Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.

Estimation of DNA Substitution Matrices and a Generalized Measure of Evolutionary Distance

W.J. Bruno and L. Arvestad
Theoretical Biology and Biophysics Group; T10, MS K710; Los Alamos National Laboratory; Los Alamos, New Mexico 87545.

When analyzing the evolutionary history of a large number of DNA sequences, the speed and complexity of the algorithm used become important, which makes distance-based algorithms for phylogeny reconstruction attractive. However, the ability to discriminate among similar trees will sometimes depend on having a very accurate and sensitive distance measure, and a distance measure based on an incorrect model of nucleotide substitution can give misleading results.

Yang [J. Mol. Evol. 39:105-111 (1994)] showed, using a maximum likelihood analysis, that currently used methods for distance estimation are based on incorrect models of nucleotide substitution. In particular, he noted that transversion rates differ among bases, beyond effects due to base frequency alone. Goldstein and Pollock [Theoretical Population Biology 45:219-226 (1994)] demonstrated that carefully combining Kimura-type measures [Kimura, J. Mol. Evol. 16:111-120 (1980)] of sequence distance based on transversions and transitions gives a robust distance measure that rivals maximum likelihood on simulated data.
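For readers unfamiliar with Kimura-type measures, the following is a minimal sketch of the standard Kimura two-parameter distance, which treats transitions and transversions separately. It is not the Goldstein and Pollock combination and not the method of this abstract. The function name k2p_distance and the handling of non-ACGT characters are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration: columns containing anything
    other than A, C, G, T (gaps, ambiguity codes) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites       # observed transition fraction
    q = transversions / sites     # observed transversion fraction
    # math.log raises ValueError if the sequences are saturated
    # (an argument of the logarithm is not positive).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

Calling k2p_distance("AAAAAAAAAA", "GCAAAAAAAA") counts one transition and one transversion over ten sites. Because the formula assumes a single transversion rate for all bases, it cannot capture the base-specific transversion rates noted by Yang.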

We show that a substitution matrix can be estimated from real data sets without using likelihood methods. The method works with both small and large data sets and executes in seconds. Once the substitution matrix is determined, its eigenvectors can be used to calculate a distance measure between pairs of sequences that is consistent with the observed pattern of substitution. This method is shown to have good linearity and discrimination properties, and it requires minimal computation.
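To illustrate how the eigenvectors of a substitution matrix can yield a pairwise distance, the sketch below uses one generic construction and should not be read as the authors' procedure, which the abstract does not specify. It assumes a rate matrix Q is already available and that its eigenvalues are real (as for a time-reversible model). If Q = V Λ V⁻¹ and the pairwise substitution matrix is P = exp(Qt), then V⁻¹ P V is diagonal with entries exp(λ_k t), so each nonzero eigenvalue gives an estimate of t. All function names, and the choice to average the per-eigenvalue estimates, are assumptions made for illustration.

```python
import numpy as np


def gtr_rate_matrix(exch, freqs):
    """Build a reversible rate matrix: Q[i, j] = exch[i, j] * freqs[j]
    off the diagonal, with each row summing to zero (illustrative helper)."""
    q = np.asarray(exch, dtype=float) * np.asarray(freqs, dtype=float)[None, :]
    np.fill_diagonal(q, 0.0)
    np.fill_diagonal(q, -q.sum(axis=1))
    return q


def transition_matrix(q, t):
    """P(t) = exp(Q t), computed from the eigendecomposition of Q."""
    lam, v = np.linalg.eig(q)
    return (v @ np.diag(np.exp(lam * t)) @ np.linalg.inv(v)).real


def eigen_distance(p_obs, q, freqs, tol=1e-9):
    """Distance (expected substitutions per site) between two sequences
    whose observed substitution matrix is p_obs, given rate matrix q.

    Assumes q has real eigenvalues. With noisy data a projected
    diagonal entry can be non-positive, in which case the logarithm
    is undefined and the result is NaN.
    """
    lam, v = np.linalg.eig(q)
    lam, v = lam.real, v.real
    # In the eigenbasis of Q, P(t) is diagonal with entries exp(lam_k * t).
    d = np.diag(np.linalg.inv(v) @ p_obs @ v)
    keep = np.abs(lam) > tol          # drop the zero (stationary) eigenvalue
    t_est = np.mean(np.log(d[keep]) / lam[keep])
    # Convert time to expected substitutions per site.
    rate = -np.dot(freqs, np.diag(q))
    return t_est * rate


# Example with made-up exchangeabilities and base frequencies (A, C, G, T).
freqs = np.array([0.1, 0.2, 0.3, 0.4])
exch = np.array([[0.0, 1.0, 4.0, 0.7],
                 [1.0, 0.0, 1.3, 5.0],
                 [4.0, 1.3, 0.0, 1.0],
                 [0.7, 5.0, 1.0, 0.0]])
q = gtr_rate_matrix(exch, freqs)
p = transition_matrix(q, 0.3)
distance = eigen_distance(p, q, freqs)
```

On an exact P(t), every nonzero eigenvalue returns the same t, so the recovered distance is linear in t. On real data the per-eigenvalue estimates differ, and a weighted combination may be preferable to the plain mean used here.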

This work was funded by the DOE Human Genome Program (ERW-F137, R. Moyzis, P.I.). W.J.B. was supported by a DOE Human Genome Postdoctoral Fellowship, and is grateful to the Santa Fe Institute for its hospitality.


Last modified: Wednesday, October 29, 2003

