Genome Mapping Section 

DOE Human Genome Program Contractor-Grantee Workshop VII 
January 12-16, 1999  Oakland, CA


66. Human BAC End Sequencing 

Shaying Zhao, Mark Adams, Bill Nierman, and Joel Malek 
TIGR, The Institute for Genomic Research, 9712 Medical Center Drive, Rockville MD 20850 
szhao@tigr.org 

BAC end sequences (BESs) provide highly specific markers. In genomic sequencing, the clones to be sequenced next can be selected by searching the completed sequence against a BES database. Because the average insert size of BAC clones is about 150 kb, BESs are also useful in chromosomal walking and assembly. End sequences from 300,000 clones (15x clone coverage) will be generated by TIGR and the University of Washington. At TIGR, we have sequenced BESs from both the CalTech and Pieter de Jong libraries with a success rate >80% and an average read length of 450 bases. The pair percentage is >65% and the average phred score is 28. We also resequence both ends of clones for which one end failed, both to raise the pair percentage and for quality control. For the clones resequenced so far, the repeat sequences have always matched the originals. The average cost is about $4.50 per BES and $0.10 per base. We continue to improve our protocol to decrease the cost and to increase the success rate and read length. To date, we have submitted more than 130,000 BESs to GenBank. We have collected more than 300,000 BESs from TIGR, the University of Washington, and CalTech for our search database at http://www.tigr.org/tdb/humgen/bac_end_search/bac_end_search.html and our FTP site (ftp://ftp.tigr.org/pub/data/h_sapiens/bac_end_sequences/). 
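
The clone-selection idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the TIGR pipeline: the `BesHit` record, its fields, and the `pick_next_clone` function are hypothetical names, and the hits are assumed to come from some earlier similarity search of the BES database against the completed sequence. The sketch picks the clone whose end sequence lies closest to the right end of the completed sequence and whose insert extends past it, so that the walk advances with minimal overlap.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BesHit:
    """One BES match on the completed sequence (hypothetical record)."""
    clone_id: str        # illustrative BAC clone name
    position: int        # 0-based start of the match on the completed sequence
    extends_right: bool  # True if the clone insert runs past the right end


def pick_next_clone(hits: Iterable[BesHit],
                    finished_length: int) -> Optional[BesHit]:
    """Return the right-extending hit with the smallest overlap, or None."""
    best: Optional[BesHit] = None
    best_overlap = 0
    for hit in hits:
        if not hit.extends_right:
            continue  # this clone would not extend the walk
        overlap = finished_length - hit.position
        if overlap < 0:
            continue  # match lies outside the completed sequence
        if best is None or overlap < best_overlap:
            best, best_overlap = hit, overlap
    return best


# Illustrative data only: three made-up hits on a 150 kb completed sequence.
example_hits = [
    BesHit("cloneA", 100_000, True),
    BesHit("cloneB", 140_000, True),
    BesHit("cloneC", 149_000, False),
]
chosen = pick_next_clone(example_hits, 150_000)
```

In practice a selection step would also weigh match identity, repeat content, and whether the clone's other end has been sequenced; those criteria are left out here to keep the sketch short.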

When finished, the 600,000 BESs will cover about 10% of the human genome and provide a sequence marker every 5 kb across the genome. BESs can be used to survey the whole genome. We searched BESs against existing databases of repeats, STSs, and ESTs; the results are presented at our web site (http://www.tigr.org/tdb/humgen/bac_end_search/bac_end_anno.html). On average, 50% of BESs contain known repeats; the repeat length ranges from 21 to 806 bases, with an average of 185 bases. About 30% of bases are masked as repeats. At an identity of >=95%, 3% of BESs match ESTs, while 0.2% match STSs, which are used to locate some of the BACs on existing chromosomal maps. BESs are also used to assess the representativeness of BAC libraries and to tie together existing contigs. We are collaborating with other institutes to map some of the BESs, and the results will be presented on the web. 
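
The coverage figures quoted in this abstract can be checked with a short calculation. The sketch below uses the numbers stated in the text (300,000 clones, 150 kb inserts, 450-base reads, two ends per clone); the genome size of 3 billion bases is an assumption that the abstract does not state, and the function names are illustrative.

```python
GENOME_SIZE_BP = 3_000_000_000  # assumed human genome size, not from the abstract
CLONES = 300_000                # clones to be end-sequenced (from the text)
INSERT_SIZE_BP = 150_000        # average BAC insert size (from the text)
READ_LENGTH_BP = 450            # average BES read length (from the text)
ENDS_PER_CLONE = 2              # each BAC yields two end sequences


def clone_coverage(clones: int, insert_bp: int, genome_bp: int) -> float:
    """Total insert length divided by genome size."""
    return clones * insert_bp / genome_bp


def sequence_fraction(bes_count: int, read_bp: int, genome_bp: int) -> float:
    """Fraction of the genome covered by end sequences, ignoring any overlap."""
    return bes_count * read_bp / genome_bp


def marker_spacing(bes_count: int, genome_bp: int) -> float:
    """Average distance in bases between consecutive end-sequence markers."""
    return genome_bp / bes_count


bes_total = CLONES * ENDS_PER_CLONE
coverage = clone_coverage(CLONES, INSERT_SIZE_BP, GENOME_SIZE_BP)
fraction = sequence_fraction(bes_total, READ_LENGTH_BP, GENOME_SIZE_BP)
spacing = marker_spacing(bes_total, GENOME_SIZE_BP)

print(f"clone coverage: {coverage:.1f}x")
print(f"sequence coverage: {fraction:.1%}")
print(f"marker spacing: {spacing / 1000:.1f} kb")
```

Under these assumptions the calculation gives 15x clone coverage, roughly 9% sequence coverage (in line with the "about 10%" quoted above), and one marker per 5 kb on average.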


 