DOE Human Genome Program Contractor-Grantee Workshop IV

Santa Fe, New Mexico, November 13-17, 1994


The electronic form of this document may be cited in the following style:
Human Genome Program, U.S. Department of Energy, DOE Human Genome Program Contractor-Grantee Workshop IV, 1994.

Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.

Prediction of Coding Regions in Genomic DNA: Optimal and Suboptimal Parses

Eric E. Snyder and Gary D. Stormo
Department of Molecular, Cellular and Developmental Biology
University of Colorado, Boulder, CO 80309

We have developed an approach for predicting coding regions in genomic DNA that utilizes multiple types of evidence, combines them into a single scoring function, and then returns both optimal and ranked suboptimal solutions under that scoring function. The current version of the program predicts four classes of sequence: introns and three types of exons (first, last, and internal). It uses a variety of statistical tests for these different classes, including tests for the signals that define their ends and for biases in their contained sequences. A neural network is used to weight the different types of statistical tests to optimize performance, which we find to be as good as or better than that of other published methods when tested on new examples. However, we find that one of the most important features of this system is the ability to examine multiple solutions, which is provided by the dynamic programming approach. These multiple, ranked solutions often indicate which portions of the predictions are most reliable, and in cases where the highest-scoring prediction is not correct, the correct one can often be found in a high-ranking suboptimal solution. Furthermore, alternative splicing patterns can often be found among the high-ranking suboptimal solutions. We have tested the robustness of the method when there are sequencing errors in the data and have shown that the system can be trained to optimize performance for data with specified error rates. We are now exploring methods for reliably predicting other classes of sequence regions, especially promoters. These include approaches based on minimal length encoding algorithms and on Sequence Landscape methods. Recent results from these approaches will be described.
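
The idea of returning ranked suboptimal parses from a dynamic program can be illustrated with a minimal sketch. It is not the authors' program. The segment labels, the transition grammar, the function name `k_best_parses`, and all scores below are assumptions made for illustration; in the system described above the scores would come from the neural-network-weighted statistical tests, which are not reproduced here.

```python
import heapq
from collections import defaultdict

# Hypothetical toy grammar: which segment label may follow which.
# A parse must run first exon -> intron -> (internal exon -> intron)* -> last exon.
NEXT = {
    "START": {"first"},
    "first": {"intron"},
    "intron": {"internal", "last"},
    "internal": {"intron"},
}


def k_best_parses(segments, length, k=3):
    """Return up to k highest-scoring parses that tile [0, length).

    segments: iterable of (start, end, label, score) with start < end,
              using half-open intervals. Scores are assumed to be additive.
    Returns a list of (total_score, [(start, end, label), ...]),
    sorted from best to worst.
    """
    by_start = defaultdict(list)
    for start, end, label, score in segments:
        by_start[start].append((end, label, score))

    # table[(pos, label)] keeps the k best partial parses that end at pos
    # and whose last segment carries that label.
    table = defaultdict(list)
    table[(0, "START")] = [(0.0, [])]

    for pos in range(length):
        for prev in NEXT:
            partials = table.get((pos, prev))
            if not partials:
                continue
            for end, label, score in by_start.get(pos, []):
                if label not in NEXT[prev]:
                    continue
                cell = table[(end, label)]
                for total, path in partials:
                    cell.append((total + score, path + [(pos, end, label)]))
                # Keeping only the top k per cell is enough to recover
                # the global top k when scores are additive.
                table[(end, label)] = heapq.nlargest(k, cell, key=lambda t: t[0])

    return table.get((length, "last"), [])


# Invented candidate segments for a 100-base toy sequence.
SEGMENTS = [
    (0, 20, "first", 5.0),
    (0, 25, "first", 4.0),
    (20, 60, "intron", 2.0),
    (25, 60, "intron", 2.5),
    (20, 40, "intron", 1.0),
    (40, 50, "internal", 3.0),
    (50, 60, "intron", 1.0),
    (60, 100, "last", 6.0),
]

if __name__ == "__main__":
    for rank, (total, path) in enumerate(k_best_parses(SEGMENTS, 100), start=1):
        print(rank, total, path)
```

In this toy input the top-ranked parse contains an internal exon, while the second-ranked parse skips it. Segments shared by all high-ranking parses (here, the last exon) are the ones a reader might treat as most reliable, which loosely mirrors the use of suboptimal solutions described in the abstract.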


Last modified: Wednesday, October 29, 2003

Document Use and Credits
Publications and webpages on this site were created by the U.S. Department of Energy Genome Program's Biological and Environmental Research Information System (BERIS). Permission to use these documents is not needed, but please credit the U.S. Department of Energy Genome Programs and provide the website http://genomics.energy.gov. All other materials were provided by third parties and not created by the U.S. Department of Energy. You must contact the person listed in the citation before using those documents.

Site sponsored by the U.S. Department of Energy Office of Science, Office of Biological and Environmental Research, Human Genome Program