The Challenges and Impact of Human Genome Research for Minority Communities
from a conference presented by
The Human Genome Project recently reached a milestone with the completion and public release of a first draft of a reference human DNA sequence in June 2000. This achievement represents not only the accomplishment of one of the primary goals of the nearly decade-old Genome Project, but also the culmination of a great tradition of scientific research in genetics and molecular biology. The "genome" is the collection of all of the DNA within an organism, and the discovery of the "genes" contained within the genome can allow us to begin to understand much about ourselves, our development, and the factors that can affect our health and welfare.
In some sense, the history of genetics and genomics can be traced back to the work of Gregor Mendel, who analyzed the transmission of traits in plant hybrids in the mid-1800s. Mendel discovered that the hereditary factors underlying traits (what we now call "genes") pass from one generation to the next with precise mathematical relationships, a discovery that laid the basis for our current understanding of genetics and heredity. However, the mechanism by which genetic information is stored and transmitted was not understood and required significant additional scientific investigation. Charles Darwin provided other crucial insights in the development of genetics when, in "On the Origin of Species," he postulated that heritable changes could arise spontaneously and that these could be passed from one generation to the next.
However, it was the publication of the structure of DNA by James Watson and Francis Crick in 1953 that truly ushered in a new era in biology, setting the stage for the molecular understanding of genetics, cellular physiology, metabolism, and evolution. Watson and Crick’s structure provided an intellectual framework in which one could understand both how hereditary information could be reliably passed between generations and how changes in the "code" could arise, giving rise to variation. Further, DNA and its sister molecule, RNA, provided an explanation for how the information stored in genes (regions of the DNA) could be turned into proteins. As proteins serve as the primary building blocks of organisms and their cells, one could then understand how changes in DNA could lead to heritable changes in organisms, leading to the variation that Darwin observed.
DNA consists of four basic nucleotide sub-units, or bases, Adenine (A), Cytosine (C), Guanine (G), and Thymine (T), linked together as a linear polymer. DNA, however, rarely occurs as a single polymer. Rather, it occurs as a double-stranded molecule in which two "complementary" strands pair together such that As from one strand always pair with Ts from the second (and vice versa) while Gs and Cs always pair. Although DNA consists of only a four-letter alphabet, linear combinations of As, Cs, Gs and Ts can encode a tremendous quantity of useful information, including the instructions that cells need to make proteins. Watson and Crick’s discovery led to a rapid series of discoveries, including a detailed understanding of transcription and translation, the mechanism by which the DNA blueprint is first converted into RNA and the RNA message is then used to make a distinct protein. What is fascinating about this process is that both the basic mechanisms and the "code" that is used to convert the DNA blueprint into a protein are nearly universal among all forms of life on earth.
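The pairing rule and the first step of reading the blueprint can be sketched in a few lines of Python. This is only an illustration: the function names and the nine-letter example strand are chosen here for the purpose, and transcription is simplified to swapping T for U (RNA uses uracil in place of T) on the coding strand.

```python
# Base-pairing rule from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base that pairs with each position of the strand."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand as read in its own direction.

    The two strands of the double helix run in opposite directions,
    so the partner strand is the complement read backwards.
    """
    return complement(strand)[::-1]


def transcribe(coding_strand: str) -> str:
    """Simplified transcription: the RNA message carries the same
    letters as the coding strand, with U in place of T."""
    return coding_strand.replace("T", "U")


example = "ATGGTGCAT"  # illustrative strand, not taken from the text
print(complement(example))          # TACCACGTA
print(reverse_complement(example))  # ATGCACCAT
print(transcribe(example))          # AUGGUGCAU
```

Because each base has exactly one partner, either strand is enough to reconstruct the other, which is what makes reliable copying between generations possible.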
This observation, and the data on genes and genomes that we have generated, have allowed us to learn a number of important lessons both about our shared genetic heritage with all life on earth and about how closely related all humans are to each other. Empirical data now tell us that the difference in the DNA sequence between any two individuals, regardless of race, sex, or national origin, is less than 1 base per thousand. The fact that we are 99.9% similar at our most basic level (in our genes and DNA) may seem surprising given the apparent diversity of human morphology (differences in height, sex, hair color, skin color, and body shape). However, we can easily understand this if we consider that we can all eat the same food, breathe the same air, have children together, and even exchange blood or transplanted organs.
In the 1970’s, Frederick Sanger and the team of Allan Maxam and Walter Gilbert independently discovered means of sequencing DNA, or reading off the series of As, Cs, Gs, and Ts that make up the genetic code. With this technique, one could envision reading the sequence of one gene, or many genes, or even the entire DNA complement, or genome, of an organism. Other novel techniques for the molecular analysis of DNA, RNA, and proteins have followed rapidly. These rapid developments in laboratory techniques, coupled with advances in computational approaches to data analysis, resulted in the creation of a new science known as "genomics." The goal of genomics is to rapidly sequence the entire genome of an organism to provide a starting point for further investigation. The first genome sequence of a free-living organism, Haemophilus influenzae (a bacterium that causes ear infections in children), was completed in 1995 by Robert Fleischmann, J. Craig Venter, and some of my other colleagues at TIGR. Since then, there has been an explosion in the number of unicellular prokaryotes (bacteria and archaea) and more complex eukaryotes (organisms, like humans, whose cells contain nuclei) that have been fully sequenced. Indeed, the announcement this summer that a working draft of the human genome had been completed was a clear signal that genomic science had reached maturity.
One might ask how and why genomics is important. To answer those questions, we first have to understand the importance of the role DNA plays. As mentioned previously, DNA contains a blueprint that cells use for making proteins. Changes in that DNA blueprint can lead to changes in the proteins that are made, and these changed proteins may not properly carry out their functions, leading to the development of a disease. For example, in sickle cell anemia, a single nucleotide change in the beta-globin gene leads to a form of the beta-globin protein that, under conditions of oxygen stress (low oxygen concentrations due to stress or exertion), can deform, causing the normally round red blood cells to "sickle," impeding blood flow and causing severe pain and, in some extreme cases, even death.
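The sickle cell example can be made concrete with a small sketch. The DNA string below is the commonly cited start of the beta-globin coding sequence, and the change shown (an A replaced by a T, turning a GAG codon into GTG so that valine is made in place of glutamic acid) is the textbook description of the sickle mutation. Treat the snippet as an illustration only: the codon table is deliberately partial, covering just the codons that appear here, and the function name is hypothetical.

```python
# Partial codon table: only the codons used in this short example.
CODON_TABLE = {
    "ATG": "Met", "GTG": "Val", "CAT": "His", "CTG": "Leu",
    "ACT": "Thr", "CCT": "Pro", "GAG": "Glu", "AAG": "Lys",
}


def translate(dna: str) -> list[str]:
    """Read the DNA three letters at a time and look up each amino acid."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3)]


# Commonly cited start of the beta-globin coding sequence (illustrative).
normal = "ATGGTGCATCTGACTCCTGAGGAGAAG"

# Single-nucleotide change: the A at zero-based position 19 becomes a T,
# so the seventh codon changes from GAG to GTG.
position = 19
sickle = normal[:position] + "T" + normal[position + 1:]

print(translate(normal))  # ['Met', 'Val', 'His', 'Leu', 'Thr', 'Pro', 'Glu', 'Glu', 'Lys']
print(translate(sickle))  # ['Met', 'Val', 'His', 'Leu', 'Thr', 'Pro', 'Val', 'Glu', 'Lys']
```

One letter out of twenty-seven differs between the two strings, yet the protein that results carries a different amino acid at that position.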
Part of our goal as scientists is to develop techniques that allow us to identify genes that play roles in disease with the hope that we can first provide information to people at risk so they can, for example, change their lifestyle to lessen their risk. Later, we hope that information will allow us to understand the mechanism responsible for the disease and allow us to develop treatments.
However, finding disease genes has been a difficult task. The human genome contains approximately 3,000,000,000 base pairs of DNA (in two copies, for a total of 6,000,000,000 base pairs). To put this in perspective, 3,000,000,000 is approximately the number of seconds in 95 years. Within that sequence, we first have to find one of 30,000-100,000 genes within the 46 chromosomes (22 autosome pairs plus the X and Y chromosomes) and then identify the "normal" and "mutant" forms of the gene. This is complicated by the natural variation (polymorphism) in the genome that distinguishes individuals. Fortunately, we have developed techniques that allow us to "zoom into" the genome, building maps of higher and higher resolution, until ultimately we discover the DNA sequence and the gene involved in the disease.
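The scale of these numbers is easy to check with a little arithmetic. The sketch below uses the figures given in the text (3,000,000,000 base pairs per copy and a difference of less than 1 base per thousand between individuals); the variable names and the choice of a 365.25-day year are assumptions made for the illustration.

```python
# Figures from the text.
GENOME_BASE_PAIRS = 3_000_000_000   # one copy of the human genome
COPIES_PER_CELL = 2                 # one copy inherited from each parent

# Assumption for this sketch: an average year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

total_base_pairs = GENOME_BASE_PAIRS * COPIES_PER_CELL
years_to_count = GENOME_BASE_PAIRS / SECONDS_PER_YEAR

# "Less than 1 base per thousand" gives an upper bound on the number of
# positions at which two people's sequences differ.
max_differences = GENOME_BASE_PAIRS // 1000

print(f"{total_base_pairs:,}")   # 6,000,000,000
print(f"{years_to_count:.1f}")   # 95.1
print(f"{max_differences:,}")    # 3,000,000
```

So counting one base per second would take about 95 years, and even a 99.9% match between two people still leaves room for up to roughly three million differing positions, which is the natural variation that complicates the search.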
As an analogy to the techniques we use, consider how an alien species might try to find, for example, the New York Knicks basketball team. Upon arriving in the solar system, they would first learn that the Knicks play somewhere on earth. As they approached earth, they would be able to make simple maps of the earth, but those maps would get better and better as they got closer to the earth. Eventually, the aliens would learn that the Knicks were somewhere on the island of Manhattan near Central Park, where they could then land and begin a detailed search.
Finding disease genes is similar. First, we identify on which chromosome the disease gene lies, often through studies of families in which the disease is prevalent. We then establish landmarks on the chromosome, building a simple map. As we gather more information our maps become more detailed, until we find a convenient landmark that is closely associated with the disease. We can then focus on that area of the genome, ultimately obtaining the DNA sequence and identifying the gene.
While scientists have been searching for disease genes, one at a time, using these techniques for years, it was only in the late 1980’s that we began to realize that finding genes would be easier if we could complete all of these steps for all human genes at once. The Human Genome Project was born at the Department of Energy (which has a long history of studying genetics), and was later joined by the National Institutes of Health and by organizations around the world. The first goal of the Genome Project was to build comprehensive maps of the human genome, placing useful landmarks along the chromosomes. With this task accomplished, the sequencing of the genome began with the goal of producing a completed genome by 2003. Spurred on by competition from the private sector, the schedule for completing the task was pushed forward and, due to the efforts of scientists around the world, we saw the announcement of the completion of the "first draft" of the human genome sequence in the summer of 2000. Although much work remains to be done to fill the gaps and finish this sequence, we now have a tremendous resource for gene discovery that will provide the starting point for much of the biology and medicine of the future.
The Human Genome Project has, in essence, provided us with our first glimpse of the "Book of Life." Our tasks now are no less challenging. We want to "mine" the DNA sequence for the presence of genes. We have to identify the functions of these genes. We want to use our tools and resources to map genes that play a role in human disease. We want to find the same genes in animal models such as the mouse and the rat so that we can continue our studies of gene function. But we also have to address a number of complex social, ethical and legal issues associated with genome data.
Our primary tool to address these issues is education. For example, many genes involved in disease are found by studying inheritance patterns in populations where a particular disease is prevalent. The discovery of a mutated gene in such a population is often misinterpreted as implying that that finding is only relevant to that particular population, or that that mutation is somehow linked to that group of people. In fact, what we should learn from genomics is that genes are universal and that the same mutation is likely to contribute to the disease in many other people.
We not only have to educate ourselves, however. It is important that we educate our legislators and government policy makers. Genetic information is extremely private information about each individual, and we must work to ensure that our privacy is respected and that genetic information is not misused. We can already identify people who are at risk for developing certain diseases that have a strong genetic component. We now have to make sure that such information is not used to deny those individuals employment, job training, or medical insurance. These responsibilities are shared by all of us. Through our continued efforts to educate ourselves, to reach out to our communities, and to communicate our fears, needs, and responsibilities to government policy makers, we have our best opportunity to have genetic and genomic information used to its greatest potential to provide a better quality of life for all people.
The online presentation of this publication is a special feature of the Human Genome Project Information Web site.