Sponsored by the U.S. Department of Energy Human Genome Program
Human Genome News Archive Edition
Human Genome News, January 1998; 9:(1-2)
The Fifth International Conference on Intelligent Systems for Molecular Biology, held June 21-25, 1997, in Porto Carras, Greece, ended with a workshop on Automatic Annotation of Genome Sequence Data.
Automatic annotation of large amounts of genomic DNA sequence clearly is, and will continue to be, a formidable challenge. When completed, the human genome sequence will consist of 24 strings of As, Ts, Cs, and Gs with a combined length of 3 billion characters. Unless the locations of biologically important parts of the sequence, such as the genes and their regulatory elements, are marked, these strings of characters are of little use. Annotating the genome sequence in parallel with its determination is therefore critical.
Attendees felt that this problem will be addressed properly only by developing very efficient computational tools for initial sequence annotation, treating the annotations as hypotheses, and testing and verifying them in the laboratory. Additionally, for maximum usefulness, the annotation results must be stored in an easily retrievable and queryable form in well-curated databases. The "If you sequence it, the community will annotate it" approach is unlikely to produce the desired results, and new paradigms and possibly new organizational models will be needed to present genomic sequence in its most useful form.
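As a rough illustration of what "treating the annotations as hypotheses" and storing them in "queryable form" could look like, the sketch below keeps each annotation as a record with a verification status and supports a simple overlap query. It is not taken from the workshop: the names `Annotation`, `Status`, and `AnnotationStore`, the 1-based inclusive coordinates, and the sample values are all assumptions made for illustration, and a real curated database would use indexed storage rather than a list scan.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Hypothetical lifecycle of an annotation treated as a hypothesis."""
    PREDICTED = "predicted"  # produced computationally, not yet tested
    VERIFIED = "verified"    # confirmed in the laboratory
    REJECTED = "rejected"    # tested and not supported


@dataclass
class Annotation:
    seq_id: str   # sequence the feature lies on (assumed naming, e.g. "chr22")
    start: int    # 1-based, inclusive (an assumed convention)
    end: int      # 1-based, inclusive
    feature: str  # e.g. "gene" or "regulatory_element"
    status: Status = Status.PREDICTED


class AnnotationStore:
    """In-memory stand-in for a curated annotation database."""

    def __init__(self):
        self._records = []

    def add(self, annotation):
        self._records.append(annotation)

    def overlapping(self, seq_id, start, end, status=None):
        """Return annotations on seq_id that overlap [start, end].

        If status is given, only annotations with that status are returned.
        """
        return [
            a for a in self._records
            if a.seq_id == seq_id
            and a.start <= end and a.end >= start
            and (status is None or a.status is status)
        ]


# Made-up example records, for illustration only.
store = AnnotationStore()
store.add(Annotation("chr22", 1000, 5000, "gene"))
store.add(Annotation("chr22", 200, 400, "regulatory_element", Status.VERIFIED))

hits = store.overlapping("chr22", 300, 1200)
verified = store.overlapping("chr22", 300, 1200, status=Status.VERIFIED)
```

Keeping the status on the record itself means a computational prediction can later be promoted to verified, or rejected, after laboratory testing without losing the original hypothesis.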
Eight workshop speakers addressed the challenges and technologies in automatic annotation and the most efficient division of labor between biology and computer science.
Introductory remarks by session chairman Chris Sander [European Molecular Biology Laboratory--European Bioinformatics Institute (EBI)] made clear that no one yet has the experience to know the right way to proceed with automatic annotation. Richard Durbin (Sanger Centre) stressed an often-repeated theme: proper annotation will require wet-laboratory work as well as computational annotation. He also stressed the need for curated databases. Michael Ashburner (EBI) discussed his experience in annotating Drosophila sequences and the need for hierarchical controlled vocabularies. He suggested the possibility of an annotation database that would be separate from but seamlessly linked to the sequence databases.
Three other speakers addressed general problems in genomic-sequence annotation: Antoine Danchin (Institut Pasteur) discussed annotation of the Bacillus subtilis genome, Terry Gaasterland (Argonne National Laboratory) described annotating microbial genomes, and Chris Overton (University of Pennsylvania) shared experiences from a project to annotate genomic sequence from human chromosome 22. Other speakers discussed annotation efforts and tools being developed in the bioinformatics industry. [Richard Mural, Life Science Division, Oak Ridge National Laboratory, email@example.com]
The electronic form of the newsletter may be cited in the following style:
Human Genome Program, U.S. Department of Energy, Human Genome News (v9n1).
Last modified: Wednesday, October 29, 2003
Document Use and Credits
Publications and webpages on this site were created by the U.S. Department of Energy Genome Program's Biological and Environmental Research Information System (BERIS). Permission to use these documents is not needed, but please credit the U.S. Department of Energy Genome Programs and provide the website http://genomics.energy.gov. All other materials were provided by third parties and not created by the U.S. Department of Energy. You must contact the person listed in the citation before using those documents.