Genome Informatics Section
DOE Human Genome Program Contractor-Grantee Workshop
105. Sequence Landscapes
Gary D. Stormo, Samuel Levy, and
Sequence Landscapes are a graphical display of word frequencies: for every word of every length in a target sequence (TS), the landscape shows how often that word occurs in a database (DB) [see Levy et al. Bioinformatics 14: 74-80, 1998]. If the TS and the DB are the same sequence, this is a convenient way to detect all repeated sequences of any length. We have also been exploring the use of this approach for classifying regions of DNA sequence into functional domains such as exons, introns, and promoters. Using a DB built from each class, the landscapes can be used to derive likelihoods that each region of the sequence belongs to each possible class. We think this information can be combined with other types of information to provide improved recognition algorithms. We are especially interested now in improving methods for determining promoter regions and transcription initiation sites. The information in the landscape can also be very useful for determining the best oligos to use on DNA chips. One criterion for choosing the best oligos is specificity for the gene being assayed. Therefore one would like to pick, for each gene, the oligo that has the most mismatches to the most similar other sites in the genome. This can be accomplished easily and efficiently with the landscape information. We return a list of candidate oligos, which can then be ranked by other criteria, including hybridization energy and melting temperature (Tm).
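To make the definition of a landscape concrete, the following is a minimal Python sketch that tabulates, for each word length and each start position in the TS, how many times that word occurs in the DB. It uses naive substring counting and is not necessarily the method described in Levy et al.; the function names (`count_words`, `landscape`, `repeated_words`) and the `max_len` cap on word length are assumptions introduced here for illustration only.

```python
from collections import Counter


def count_words(db, max_len):
    """Count overlapping occurrences of every word of length 1..max_len in db."""
    counts = Counter()
    for k in range(1, max_len + 1):
        for i in range(len(db) - k + 1):
            counts[db[i:i + k]] += 1
    return counts


def landscape(ts, db, max_len):
    """Return rows where rows[k - 1][i] is the DB frequency of the word ts[i:i + k]."""
    counts = count_words(db, max_len)
    rows = []
    for k in range(1, max_len + 1):
        # A Counter returns 0 for words that never occur in the DB.
        rows.append([counts[ts[i:i + k]] for i in range(len(ts) - k + 1)])
    return rows


def repeated_words(seq, min_len, max_len):
    """Self-landscape (TS == DB): words of length min_len..max_len seen more than once."""
    rows = landscape(seq, seq, max_len)
    found = set()
    for k in range(min_len, max_len + 1):
        for i, freq in enumerate(rows[k - 1]):
            if freq > 1:
                found.add(seq[i:i + k])
    return sorted(found)


if __name__ == "__main__":
    for k, row in enumerate(landscape("ACGT", "ACGACG", 3), start=1):
        print(k, row)
    print(repeated_words("ACGACG", 2, 3))
```

The same table could feed the other uses mentioned above. For example, a candidate oligo whose words have a frequency of 1 in a whole-genome DB occurs only at its own site, which is a simple exact-match proxy for specificity. The mismatch-based ranking described in the abstract would need more than this sketch provides.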