Sequencing Human Chromosome 16: SAmple SEquencing (SASE) Analysis as a Framework for Identifying Genes and Complete Large-scale Genomic Sequencing*

Darrell O. Ricke, Judith M. Buckingham, A. Christine Munk, Rebecca Lobb, Elizabeth H. Saunders, Jingmei Liu, Norman A. Doggett, Michael R. Altherr, Larry L. Deaven, and Robert K. Moyzis.

Life Sciences Division and Center for Human Genome Studies, Los Alamos National Laboratory, Los Alamos, NM 87545.

The human chromosome 16 physical map (Doggett et al., Nature 377:Suppl:335-365, 1995; Doggett et al., this meeting) provides an ideal framework for sequencing a human chromosome. We are using a SAmple SEquencing (SASE) approach to rapidly generate aligned sequences along the chromosome 16 physical map. SASE analysis is a method for rapidly "scanning" large genomic regions at minimal cost while identifying and localizing most genes. Briefly, individual cosmids are partially digested with Sau3A, and 3 kb fragments are recloned into double-strand sequencing vectors. Sequencing both ends of a 1X sampling of these recloned fragments, along with the end sequences of the cosmid, achieves 70% sequence coverage with 98% clone coverage. The majority of this clone coverage is ordered by the relationships between the subclone end sequences. These ordered sequences are ideal substrates for directed sequencing strategies (see Chi et al., this meeting). SASE analysis has been initiated on the 40 Mb short arm of chromosome 16. We propose to make chromosome 16 SASE sequences, along with feature annotation, publicly available through GSDB. Such data are sufficient to allow PCR amplification of the sequenced region from GSDB submissions alone, eliminating the need for extensive clone archiving and distribution. SASE analysis therefore provides the opportunity for numerous laboratories to complete the sequencing of chromosome 16 in a distributed fashion. To identify and annotate regions of biological interest, we have developed the SCAN (Sequence Comparison ANalysis) algorithm to extract and identify significant homologies from database search results of SASE data. Initial SCAN results on the first 0.6 Mb of SASE data analyzed have identified multiple candidate genes and exons.
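The relationship between sequence coverage and clone coverage quoted above can be illustrated with a simple Poisson (Lander-Waterman style) estimate. The sketch below is not the authors' calculation: the 400 bp read length, the function names, and the use of an idealized random-sampling model are all assumptions made for illustration, and the cosmid end sequences are ignored. Only the 3 kb insert size and the 1X sampling come from the abstract.

```python
import math


def expected_coverage(redundancy: float) -> float:
    """Expected fraction of a target covered at a given redundancy,
    assuming fragments start at uniformly random positions (Poisson model)."""
    if redundancy < 0:
        raise ValueError("redundancy must be non-negative")
    return 1.0 - math.exp(-redundancy)


def sase_coverage(read_len_bp: int, insert_len_bp: int, sequence_redundancy: float):
    """Return (sequence coverage, clone coverage) for end-sequenced subclones.

    Each subclone contributes two end reads, so N subclones give a sequence
    redundancy of N * 2 * read_len / target and a clone redundancy of
    N * insert_len / target. Their ratio is insert_len / (2 * read_len).
    """
    clone_redundancy = sequence_redundancy * insert_len_bp / (2 * read_len_bp)
    return expected_coverage(sequence_redundancy), expected_coverage(clone_redundancy)


# Assumed read length: 400 bp (hypothetical; not stated in the abstract).
# Insert size (3 kb) and 1X sequence sampling are taken from the abstract.
seq_cov, clone_cov = sase_coverage(read_len_bp=400, insert_len_bp=3000,
                                   sequence_redundancy=1.0)
print(f"sequence coverage ~ {seq_cov:.1%}, clone coverage ~ {clone_cov:.1%}")
```

Under these assumptions the model gives roughly 63% sequence coverage and just under 98% clone coverage, in the same range as the 70% and 98% reported above. The idealized model omits the cosmid end sequences and the actual read lengths, so exact agreement should not be expected.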

*This work is funded by USDOE under contract W-7405-ENG-36.


Abstracts scanned from text submitted for January 1996 DOE Human Genome Program Contractor-Grantee Workshop.
