Genome Informatics Section 

DOE Human Genome Program Contractor-Grantee Workshop VII 
January 12-16, 1999  Oakland, CA


104. Computer Analysis of DNA Sequence Data to Locate SECIS Elements 

Michael Giddings, Olga Gurvich, Marla Berry, John Atkins, and Raymond Gesteland 
University of Utah, Department of Human Genetics, 15 N. 2030 E. rm. 6160, Salt Lake City, UT 84112-5330 
giddings@genetics.utah.edu 

The selenocysteine insertion sequence (SECIS) has been found in a number of organisms. In eukaryotes, it is a structured element in the 3' untranslated region that alters ribosomal translation of the coding region, so that selenocysteine is inserted at a UGA codon rather than translation terminating there as it normally would. SECIS elements vary in structure, but all share several features. Known examples contain a core of 4-5 nucleotides that do not follow the usual Watson-Crick pairing rules, as well as a stem-loop carrying a group of 2-3 adenosine nucleotides, either as a bulge or as part of the upper loop.
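To make the recoding concrete, the following is a minimal Python sketch of how a UGA codon is read with and without a SECIS element. It is a deliberate simplification: the `has_secis` flag, the `translate` function, and the partial codon table are illustrative assumptions, and real recoding is neither all-or-nothing nor decided by a single flag.

```python
# Illustrative only: a partial codon table and a yes/no SECIS flag.
# A real translator needs all 64 codons, and real recoding is not binary.
CODONS = {"AUG": "M", "GGU": "G", "UGU": "C", "AAA": "K", "UAA": "*", "UAG": "*"}


def translate(mrna, has_secis):
    """Translate an mRNA string in frame 0, reading UGA as selenocysteine
    (one-letter code "U") only when a SECIS element is assumed present."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA":
            if not has_secis:
                break  # UGA acts as an ordinary stop codon
            protein.append("U")  # UGA recoded as selenocysteine
            continue
        residue = CODONS[codon]
        if residue == "*":
            break  # UAA or UAG: stop
        protein.append(residue)
    return "".join(protein)


message = "AUGGGUUGAAAAUAA"  # AUG GGU UGA AAA UAA
print(translate(message, has_secis=False))
print(translate(message, has_secis=True))
```

Without the flag, translation of the example message ends at the UGA codon; with it, the UGA contributes a selenocysteine and translation continues to the UAA stop.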

Locating new SECIS elements through pattern searches in DNA databases poses significant challenges. We are using a three-phase approach to this problem. The first phase uses a fast, rough scan of large databases with Ross Overbeek's "patscan" software to narrow the field of candidates. The second phase, currently being implemented, performs a refined analysis of the candidates, scoring and ranking them with a new algorithm for which neural network and fuzzy logic approaches are being explored. The third phase visualizes the candidates, using an algorithm that applies rules for the formation of SECIS elements to fold and display them for human analysis.
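As a rough illustration of what a first-phase scan looks for, the following Python sketch reports windows in which two arms can base-pair around a loop containing a run of adenosines. It is not patscan and not the pattern used in this work; the function name `find_candidates` and every length and threshold in it are hypothetical values chosen only to keep the example small.

```python
# Illustrative rough scan for stem-loop candidates.
# NOT patscan and NOT the authors' pattern; all parameters are hypothetical.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """True for Watson-Crick pairs and G-U wobble pairs."""
    return (a, b) in PAIRS


def find_candidates(seq, stem_len=6, loop_min=6, loop_max=12, max_mismatch=1):
    """Return (start, end, loop) for each window that looks like a stem-loop
    whose loop contains a run of at least two adenosines."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for start in range(len(rna)):
        for loop_len in range(loop_min, loop_max + 1):
            end = start + 2 * stem_len + loop_len
            if end > len(rna):
                break  # longer loops cannot fit either
            left = rna[start:start + stem_len]
            loop = rna[start + stem_len:end - stem_len]
            right = rna[end - stem_len:end]
            # Pair the 5' arm against the 3' arm read in reverse.
            mismatches = sum(
                not can_pair(a, b) for a, b in zip(left, reversed(right))
            )
            if mismatches <= max_mismatch and "AA" in loop:
                hits.append((start, end, loop))
    return hits


if __name__ == "__main__":
    for hit in find_candidates("GGGCCCTTAAATTTGGGCCC", max_mismatch=0):
        print(hit)
```

A scan this loose is meant to over-report: it only narrows the field, leaving the scoring, ranking, and folding of candidates to the later phases.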

We present the implementation details of this system in its current form, along with initial results from scanning several genomic databases.

We would like to acknowledge the following supporters of this work: 

DOE Grant #DE-FG03-94ER61817, "Advanced Sequencing Technology" 

DOE Grant (no # assigned yet), "Genomic Analysis of the Multiplicity of Protein Products from Genes" 

NIH Genome Training Grant #T32HG00042 


 