DOE Human Genome Program Contractor-Grantee
150. A modular integrated system for high throughput DNA sequencing
Voula Kodoyianni, Ying Ge, Geoffrey K. Krummel, Jessica M. Severin, Michael S. Westphall, Michael T. Borchardt, Laura Grable, Anne S. Olsen (2) and Lloyd M. Smith
Department of Chemistry, University of Wisconsin-Madison, Madison, WI and (2) DOE Joint Genome Institute, Walnut Creek, CA
We have developed an integrated modular system for high-throughput DNA sequencing. The system has four major components: a) a robotic template preparation module that converts M13-containing bacterial cultures from random subclone libraries into ready-to-sequence DNA; b) a robotic module for sequencing either M13 DNA using extension dye primer chemistry or plasmid DNA using Big Dye Terminator chemistry; c) a custom-made electrophoresis module; and d) a software package (GelImager and Basefinder) that converts raw gel images into ready-to-assemble sequence data.
This system is being "hardened" (thoroughly debugged in production operation) in a collaborative project with the DOE Joint Genome Institute (JGI) to sequence minimally overlapping, mapped BAC and cosmid clones that cover 1 Mb of human chromosome 19q13.2 and its syntenic region in mouse. These DNA regions encode a cluster of zinc finger proteins. A detailed comparison of the human gene sequences with their murine orthologs will allow a better understanding of the regulation, evolution and functional diversification of these proteins.
Our sequencing strategy can be divided into six stages: a) clone validation, to ensure that no deletions have occurred; b) random shotgun library construction, in which two libraries are made for each BAC/cosmid: a single-stranded library in M13 (1.2-2 kb inserts) and a plasmid library in pBC SK+ (2.5-5 kb inserts); c) shotgun sequencing in a 96-well format, beginning with DNA purification, followed by sequencing of the DNA template with dye primer (M13) or dye terminator (plasmid) chemistry, and gel electrophoresis; d) gel data processing (using GelImager and Basefinder) and sequence assembly using phrap; e) finishing the assembled reads to 99.99% accuracy using consed and autofinish; and f) annotation before submission to GenBank.
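To put the 99.99% finishing target in concrete terms, the short sketch below converts an accuracy figure into a Phred-scaled quality value (Q = -10 log10 of the error probability) and estimates the number of residual errors across a 1 Mb region. It assumes the target is read as a per-base error rate; the function names are illustrative only and are not part of GelImager, Basefinder, phrap or consed.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (between 0 and 1, exclusive) to a Phred-scaled quality."""
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


def expected_errors(num_bases: int, accuracy: float) -> float:
    """Expected number of erroneous bases, assuming a uniform per-base error rate."""
    return num_bases * (1.0 - accuracy)


# Assumption: 99.99% accuracy is interpreted as a per-base error rate of 1 in 10,000.
target_accuracy = 0.9999
region_bases = 1_000_000  # the 1 Mb region described in the abstract

print(round(accuracy_to_phred(target_accuracy)))              # 40
print(round(expected_errors(region_bases, target_accuracy)))  # 100
```

Under this reading, the finishing target corresponds to roughly Q40, or on the order of 100 residual errors over the full 1 Mb region.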
We are currently sequencing 14 human clones (8 cosmids and 6 BACs) and 4 murine clones (all BACs). Our progress in analyzing these clones will be presented.
DOE - DE-FG02-97ER62386 and NIH - R01HG01886.