DOE Human Genome Program Contractor-Grantee
3. Large-Scale Finishing of Human and Mouse Genomic Sequences
Richard M. Myers, Jeremy Schmutz, Jane Grimwood, the Sequencing Group at Stanford Human Genome Center, and the Joint Genome Institute
The Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5120 and the Joint Genome Institute, 2800 Mitchell Drive, B100, Walnut Creek, CA 94598
We have begun a new collaboration with the Joint Genome Institute (JGI) to generate large amounts of finished human and mouse sequence from "draft" sequences produced by the JGI and its associated laboratories. Our goal is to produce about 100 Mb of finished sequence each year, focusing first on finishing human chromosome 19 while also finishing clones from human chromosomes 5 and 16 and syntenic mouse sequences. Our criteria for considering a large-insert clone finished are that it has an estimated error rate of less than one error in 10,000 bp and that the entire sequence is contiguous, except for small, difficult-to-fill gaps of known size in a small fraction of the clones.
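The error-rate criterion above can be illustrated with a minimal sketch. It assumes that per-base quality values are available on the standard Phred scale (Q = -10 log10 of the error probability, so one error in 10,000 bp corresponds to Q40) and that a clone's estimated error rate is taken as the mean per-base error probability. The abstract does not say how the estimate is actually computed, and the function names here are hypothetical.

```python
# Sketch only: the estimation method and all names are assumptions,
# not the procedure used by the sequencing group.

FINISHED_MAX_ERROR_RATE = 1 / 10_000  # threshold stated in the abstract


def phred_to_error_prob(q):
    """Convert a Phred-scaled quality value to an error probability."""
    return 10 ** (-q / 10)


def estimated_error_rate(quals):
    """Mean per-base error probability over a consensus sequence."""
    if not quals:
        raise ValueError("no quality values supplied")
    return sum(phred_to_error_prob(q) for q in quals) / len(quals)


def meets_error_criterion(quals, max_rate=FINISHED_MAX_ERROR_RATE):
    """True if the estimated error rate is below the finishing threshold."""
    return estimated_error_rate(quals) < max_rate
```

Under these assumptions, a consensus of 100 bases at Q50 passes, while the same consensus with a single Q20 base does not, which is one way to see why low-quality regions need directed finishing reactions.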
We receive subclones and sequence traces from the JGI and reassemble the data, generally resulting in assemblies with 10-20 contigs per 100 kb from the 6X shotgun sequence coverage produced for each clone. We then use a computationally driven process that requires almost no human decision-making to choose subclones, directed sequencing reactions, and, for a portion of the reactions, oligonucleotide primers. After this automated stage, all or almost all of the gaps are filled and the clone is passed to a group of finishers, who reassemble the sequence data and design specialized sequencing reactions to fill any remaining gaps and to raise the quality of the sequence so that the entire clone meets our criteria for finished sequence. The final sequence is checked by our informatics group and then submitted to GenBank. We have finished more than 3 Mb of sequence with the JGI since this collaboration began, and expect to have an additional 8 Mb finished by the end of February. We hope to achieve an average of 8-10 Mb of finished sequence per month within the next quarter.