Beyond the Identification of Transcribed
Sequences:
Functional and Expression Analysis
11th Annual Workshop
November 9-12, 2001
Washington D.C.
Prof. Michael Q. Zhang
Watson School of Biological Sciences
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
telephone: 516-367-8393
fax: 516-367-8461
email: mzhang@cshl.org
prestype: Platform
presenter: Michael Q Zhang
Ramana V. Davuluri,1 Yutaka Suzuki,2 Sumio Sugano2 and Michael Q. Zhang1
1Cold Spring Harbor Laboratory; 1 Bungtown Road; PO Box: 100; Cold Spring Harbor,
New York 11724; Tel: 516-367-6956; Fax: 516-367-8461; Email: {ramana, mzhang}@cshl.org
2Department of Virology; Institute of Medical Sciences; The University of Tokyo;
4-6-1, Shirokanedai, Minato-ku; Tokyo 108-8639, Japan
A non-redundant database of 2312 full-length human 5'UTRs was carefully prepared using state-of-the-art experimental and computational technologies. A comprehensive computational analysis of these data was conducted to characterize 5'UTR features. Classification and Regression Tree (CART) analysis was used to classify the data into three distinct classes. Class I consists of mRNAs that are believed to be poorly translated, with long 5'UTRs containing many potential inhibitory features. Class II consists of TOP (terminal oligopyrimidine tract) mRNAs, which are regulated in a growth-dependent manner. Class III consists of mRNAs with favorable 5'UTR features that may support efficient translation. The most accurate tree we found has a classification accuracy of 92.5%, as estimated by cross-validation. The most relevant variables in the classification model were the presence of a TOP, secondary structure, 5'UTR length, and the presence of upstream AUGs (uAUGs). The present classification and characterization of 5'UTRs provide valuable information for better understanding the translational regulation of human mRNAs. Furthermore, this database and classification can help in building better computational models for predicting the 5' terminal exon and separating the 5'UTR from the coding region.
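As an illustration only, three of the sequence-level variables named above (5'UTR length, uAUG count, and presence of a TOP) could be computed from a 5'UTR sequence along the following lines. This is a minimal sketch, not the pipeline used in this work: the function name `utr_features` and the TOP rule used here (a C at the first position followed by at least four more pyrimidines) are assumptions made for the example, and secondary structure is omitted because it requires an RNA folding program.

```python
import re

# Assumed TOP rule for this sketch: a C at the 5' end followed by
# at least four further pyrimidines (C or T). Not taken from the abstract.
TOP_PATTERN = re.compile(r"^C[CT]{4,}")


def utr_features(utr):
    """Return simple sequence features for one 5'UTR (hypothetical helper).

    The input may be RNA or DNA; it is normalized to upper-case DNA letters.
    """
    seq = utr.upper().replace("U", "T")
    return {
        # Number of nucleotides upstream of the main start codon.
        "length": len(seq),
        # Upstream AUGs: every ATG inside the 5'UTR, in any reading frame.
        # ATG cannot overlap itself, so str.count finds all occurrences.
        "uaug_count": seq.count("ATG"),
        # True if the sequence starts with the assumed TOP motif.
        "has_top": bool(TOP_PATTERN.match(seq)),
    }


if __name__ == "__main__":
    # Toy sequence, made up for illustration; not from the database.
    print(utr_features("CTTTCCGGATGCCATGA"))
```

Features of this kind, together with a secondary-structure measure, would form the input table for a decision-tree learner such as CART.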