Genome Mapping Section
DOE Human Genome Program Contractor-Grantee Workshop
65. Characterization of a BAC Clone Resource for Human Genomic Sequencing: Analysis of 150 Mb of Human STCs and Implications for Human Genomic Sequencing
G. G. Mahairas, J. C. Wallace, J. Furlong, K. Smith, S. Swartzell, A. Keller, HTSC Staff and L. Hood
Together with The Institute for Genomic Research (TIGR), we have sequenced the BAC ends, or sequence tagged connectors (STCs), from 160,000 BAC clones. We have also generated a HindIII restriction digest for each BAC whose end sequences were determined at the University of Washington, and we have developed strategies and tools for using this resource in support of large-scale genomic sequencing. We have demonstrated proof of concept for its use. Together with TIGR, we propose to complete the characterization of an STC clone resource from two IRB-approved human BAC libraries to 22.5-fold clone (BAC) coverage (i.e., 450,000 BAC clones, assuming an average insert size of 150 kb). These data are available on the World Wide Web through dbGSS and our web sites (www.genome.washington.edu and www.tigr.org), and the clones are available for distribution to the scientific community through Research Genetics. Nine hundred thousand STC sequences will provide a sequence marker of 300 to 500 base pairs (bp), on average every 3,100 bp across the genome. The BAC libraries and the data pertaining to them will enable the facile selection of minimum tiling paths of BAC clones across each of the human chromosomes for large-scale sequence analysis. Here we present data to support the STC approach for sequencing the human genome and other moderate to large genomes. The STC approach eliminates the need for up-front physical mapping and uses BAC clones as the basic sequencing reagent. The major advantages of the STC approach are: (i) The cost and effort required to obtain complete low- and high-resolution maps are reduced, and front-end automation is greatly simplified. (ii) The BAC clones are readily available through Research Genetics. (iii) As improved techniques for generating BACs, or other yet-to-be-developed libraries, appear, reasonable numbers of these new clones could easily be added to the database and clone collection.
(iv) This approach will obviate the significant problem of closure for high-resolution physical mapping. (v) The existing chromosomal landmarks (STSs, PCR-specific sites, ESTs, or partial cDNA sequences) can easily be placed on the BAC clones, adding markers to the BAC clones and taking significant advantage of any associated biological information. (vi) The 10% of the genome obtained in the STCs can be searched against the sequence databases to identify many interesting landmarks (e.g., genes, STSs, ESTs) that could locate the BAC clone on the preexisting chromosomal maps. (vii) Chromosomal regions of key biological interest can be identified and sequenced first. (viii) The human genome can be sequenced earlier and at lower cost. (ix) The STC approach will provide useful clones for biological studies even at the very early STC sequencing stages, when only 3- to 4-fold coverage has been achieved. The STC approach streamlines the task of clone selection by doing much of the work up front and by using sequence alignment and computers as the primary tools to identify sequencing targets. Additional major advantages of the STC strategy are that it is rapid, because clone selection is automated; that STC data correlate directly with a clone that can be used for shotgun sequencing without further evaluation; and that it surveys the entire genome and provides a denser set of markers, allowing greater versatility in the use of the data, including genotyping analysis. Perhaps the greatest advantage of the STC resource is that it can be used by any investigator for clone or sequencing target selection via the World Wide Web. The STC clone library also serves as a large-scale genomic survey tool and provides access to many characterized clones in any part of the genome. The implications of this type of resource transcend simple genomic sequencing. Additionally, we will describe the University of Washington High Throughput Sequencing Facility, which is capable of producing 2 million BAC end sequences per year.
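The coverage and marker-density figures quoted in the abstract follow from simple arithmetic, which the short Python sketch below reproduces. It is only an illustration: the 3 Gb genome size is an assumption (the abstract does not state the value it used), and the function names are hypothetical.

```python
# Assumption: a haploid human genome of roughly 3 Gb. The abstract does not
# state the genome size it used, so this constant is illustrative only.
GENOME_SIZE_BP = 3_000_000_000


def clone_coverage(n_clones, mean_insert_bp, genome_bp=GENOME_SIZE_BP):
    """Fold coverage of the genome by clone inserts."""
    return n_clones * mean_insert_bp / genome_bp


def mean_marker_spacing(n_stcs, genome_bp=GENOME_SIZE_BP):
    """Average distance in bp between consecutive STC markers."""
    return genome_bp / n_stcs


def fraction_sampled(n_stcs, mean_read_bp, genome_bp=GENOME_SIZE_BP):
    """Fraction of the genome covered by STC reads, ignoring overlaps."""
    return n_stcs * mean_read_bp / genome_bp


if __name__ == "__main__":
    n_clones = 450_000        # from the abstract
    insert_bp = 150_000       # average insert size from the abstract
    n_stcs = 2 * n_clones     # two end sequences per BAC clone

    print("clone coverage (fold):", clone_coverage(n_clones, insert_bp))
    print("mean STC spacing (bp):", round(mean_marker_spacing(n_stcs)))
    for read_bp in (300, 500):  # STC read-length range from the abstract
        print(read_bp, "bp reads sample", fraction_sampled(n_stcs, read_bp))
```

With the assumed 3 Gb genome, 450,000 clones of 150 kb give the 22.5-fold coverage stated above. The computed spacing comes out somewhat larger than the 3,100 bp quoted, which would be consistent with a slightly smaller assumed genome size. Reads of 300 to 500 bp sample roughly 9% to 15% of the genome, in line with the "10%" figure in advantage (vi).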