The Human Genome Project was completed in 2003. One of the primary research
areas was DNA sequencing. This page details that research.
The HGP's emphasis was on obtaining a complete and
highly accurate reference sequence (1 error in 10,000 bases), largely continuous
across each human chromosome. Scientists believe that knowing this sequence
is critically important for understanding human biology and for applications
to other fields.
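The accuracy target can be restated arithmetically. The short sketch below converts an error rate of 1 in 10,000 bases into a percentage accuracy and an equivalent Phred-style quality score, and estimates the expected number of errors over a sequence of a given length. The 3-billion-base genome length is a round-number assumption for illustration, not a figure from this page, and the function names are hypothetical.

```python
import math


def accuracy_percent(error_rate: float) -> float:
    """Percentage of bases expected to be correct at the given error rate."""
    return (1.0 - error_rate) * 100.0


def phred_quality(error_rate: float) -> float:
    """Phred-style quality score: Q = -10 * log10(error probability)."""
    return -10.0 * math.log10(error_rate)


def expected_errors(num_bases: int, error_rate: float) -> float:
    """Expected number of erroneous bases in a sequence of num_bases."""
    return num_bases * error_rate


ERROR_RATE = 1 / 10_000                 # target stated in the text
ASSUMED_GENOME_BASES = 3_000_000_000    # assumption: round-number genome size

print(f"accuracy: {accuracy_percent(ERROR_RATE):.2f}%")
print(f"Phred-style quality: Q{phred_quality(ERROR_RATE):.0f}")
print(f"expected errors: {expected_errors(ASSUMED_GENOME_BASES, ERROR_RATE):,.0f}")
```

Under these assumptions, even a 99.99% accurate reference would be expected to contain errors on the order of hundreds of thousands of bases, which is one way to see why the accuracy standard mattered.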
A "working draft" of the human genome DNA sequence was completed in June 2000
and published in February 2001. The working draft comprises shotgun sequence
data from mapped clones, with gaps and ambiguities unresolved. Draft sequence
provides a foundation for obtaining the high-quality finished sequence and is
also a valuable tool for researchers hunting disease genes. See the
February 2001 and April 2003 Science and Nature papers analyzing the sequence.
Human DNA Sequence Goals
- Achieve coverage of at least 90% of the genome in a working draft based on mapped clones
by the end of 2001.
- Finish one-third of the human DNA sequence by the end of 2001.
- Finish the complete human genome sequence by the end of 2003.
- Make the sequence totally and freely accessible.
Another goal focused on identifying individual variations in the human genome.
Although more than 99% of human DNA sequences are the same across the
population, variations in DNA sequence can have a major impact on how
humans respond to disease; environmental insults such as bacteria, viruses,
toxins, and chemicals; and drugs and other therapies.
Methods have been developed to detect different types of variation,
particularly the most common type called single-nucleotide polymorphisms
(SNPs), which occur about once every 100 to 300 bases. SNP maps are helping
scientists identify the multiple genes associated with such complex diseases
as cancer, diabetes, vascular disease, and some forms of mental illness.
These associations are difficult to establish with conventional gene-hunting
methods because a single altered gene may make only a small contribution
to disease risk.
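To make the idea of a single-nucleotide polymorphism concrete, the sketch below compares two already-aligned sequences of equal length and reports the positions where a single base differs. It also estimates how many SNPs the density quoted above (about one every 100 to 300 bases) would imply for a region of a given length. The sequences are invented toy data and the function names are hypothetical. Real SNP discovery compares many individuals and relies on alignment and quality filtering that this sketch omits.

```python
def find_single_base_differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumes the two sequences are already aligned and have equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def expected_snp_range(region_length: int) -> tuple[int, int]:
    """Low and high SNP counts implied by one SNP per 300 and per 100 bases."""
    return region_length // 300, region_length // 100


# Toy data, not real genomic sequence.
reference = "ACGTTAGCCA"
sample = "ACGTCAGCCA"

print(find_single_base_differences(reference, sample))  # [(4, 'T', 'C')]
print(expected_snp_range(30_000))                       # (100, 300)
```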
Human Genome Sequence Variation Goals
- Develop technologies for rapid, large-scale identification and scoring
of single-nucleotide polymorphisms and other DNA sequence variants.
- Identify common variants in the coding regions of the majority of
identified genes during the 5-year plan period (1998–2003).
- Create a SNP map of at least 100,000 markers.
- Develop the intellectual foundations for studies of sequence variation.
- Create public resources of DNA samples and cell lines.
Text adapted from F. Collins, Ari Patrinos, et al., "New Goals for
the U.S. Human Genome Project: 1998–2003," Science
282: 682-689 (1998). For a more detailed explanation of sequencing,
see the U.S. DOE Primer on
Molecular Genetics. See HGP
Goals for more details on the project's goals.
Area | Goal | Achieved
DNA Sequence | 95% of gene-containing part of human sequence finished to 99.99% accuracy | [...] gene-containing part of human sequence finished to 99.99% accuracy
Capacity and Cost of Finished Sequence | Sequence 500 Mb/year at <$0.25 per finished base | [...] Mb/year at <$0.09 per finished base
Human Sequence Variation | 100,000 mapped human SNPs | 3.7 million mapped human SNPs
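The capacity and variation figures above lend themselves to a quick back-of-the-envelope check. The sketch below computes the upper bound on annual sequencing cost implied by the goal of 500 Mb/year at under $0.25 per finished base, and the ratio between the SNPs mapped and the SNP goal. Only the goal row is costed here. The function and constant names are hypothetical, and the cost is an upper bound because the goal is stated as "less than" the per-base figure.

```python
BASES_PER_MB = 1_000_000


def max_annual_cost(mb_per_year: float, cost_per_base_ceiling: float) -> float:
    """Upper bound on yearly cost in dollars for a given throughput and per-base ceiling."""
    return mb_per_year * BASES_PER_MB * cost_per_base_ceiling


goal_cost_ceiling = max_annual_cost(500, 0.25)
snp_ratio = 3_700_000 / 100_000  # mapped SNPs achieved versus the goal

print(f"goal cost ceiling: ${goal_cost_ceiling:,.0f} per year")  # $125,000,000 per year
print(f"SNPs mapped relative to goal: {snp_ratio:.0f}x")          # 37x
```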
Sequence data from both ends of mapped bacterial artificial chromosome (BAC)
clones provide researchers with a series of markers spaced approximately
every 3000 to 4000 bases across the genome. Researchers use these markers
as "sequence tag connectors" (STCs) to identify the specific clones that need
to be sequenced to extend sequenced regions further along the chromosomes,
and for other uses in large-scale sequencing efforts.
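The way an STC points to the next clone to sequence can be illustrated with a toy model. In the sketch below, each clone is represented only by the short sequences read from its two ends, and the already-sequenced region is a string. The function picks the clone whose left end tag matches closest to the right edge of the sequenced region, on the assumption that such a clone overlaps least and therefore extends the region furthest. The data, the names, the exact-match search, and the fixed clone orientation are all simplifying assumptions, not a description of the pipeline actually used by the HGP.

```python
from typing import Optional


def pick_extending_clone(
    sequenced_region: str,
    clone_end_tags: dict[str, tuple[str, str]],
) -> Optional[str]:
    """Return the clone whose left end tag matches nearest the region's right edge.

    clone_end_tags maps a clone name to (left_end_tag, right_end_tag).
    Returns None when no clone's left tag occurs in the sequenced region.
    """
    best_name = None
    best_position = -1  # str.rfind returns -1 when the tag is absent
    for name, (left_tag, _right_tag) in clone_end_tags.items():
        position = sequenced_region.rfind(left_tag)
        if position > best_position:
            best_name, best_position = name, position
    return best_name


# Toy data, not real genomic sequence.
sequenced_region = "ACGTACGGTTCAGGCTAAGC"
clone_end_tags = {
    "bac_A": ("GGTTCA", "TTGACC"),  # left tag matches at position 6
    "bac_B": ("GCTAAG", "CCATGA"),  # left tag matches at position 13
    "bac_C": ("TTTTTT", "GGGGGG"),  # no match in the sequenced region
}

print(pick_extending_clone(sequenced_region, clone_end_tags))  # bac_B
```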