Beyond the Identification of Transcribed Sequences:
Functional and Expression Analysis
11th Annual Workshop
November 9-12, 2001
Washington, D.C.
Winston Hide
South African National Bioinformatics Institute
University of the Western Cape
Private Bag X17
Bellville
South Africa
telephone: +27 21 959 3645
fax: +27 21 959 2512
email: winhide@sanbi.ac.za
prestype: Platform
presenter: Winston Hide
Winston Hide, Tzu-Ming Chern, Janet Kelso and Vladimir Babenko
Completion of the human genome sequence provides evidence for a gene count with a lower bound of 30 000 to 40 000. Significant protein complexity may derive in part from multiple transcript isoforms. Recent EST-based studies have revealed that alternate transcription, including alternative splicing, polyadenylation and transcription start sites, occurs within at least 30-40% of human genes. Transcript form surveys have yet to integrate the genomic context, expression, frequency, and contribution to protein diversity of isoform variation. We describe the degree to which protein coding diversity may be influenced by alternate expression of transcripts. In this first intensive hand-curated assessment of exon skipping on chromosome 22, 545 genes have been studied. Combining manual assessment with software screening of exon boundaries provides a highly accurate and internally consistent indication of skipping frequency. 57 of 62 exon skipping events occur in the protein coding regions of 52 genes. A single gene (FBXO7) expresses an exon repetition. 59% of highly represented multi-exon genes are likely to express exon-skipped isoforms in ratios that vary from 1:1 to 1:>100. The proportion of all transcripts corresponding to multi-exon genes that exhibit an exon skip is estimated to be 5%. A comparison with mouse orthologous genes reveals that skipping events common to both species are not frequently detected, but that the frequency of skipping is similar between mouse and man. Comparative assessment of expression state and skip occurrence is discussed.
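The abstract does not describe how the software screening of exon boundaries was implemented. As an illustration only, the following minimal Python sketch shows one way an exon skip could be flagged from boundary coordinates. It assumes that a gene's reference exons and a transcript's aligned blocks are both given as ordered (start, end) pairs on the same strand, and that splice junctions match reference boundaries exactly. The function name `skipped_exons` and the coordinates are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end) genomic coordinates; hypothetical representation


def skipped_exons(reference: List[Exon], transcript: List[Exon]) -> List[int]:
    """Return indices of reference exons bypassed by a transcript's splice junctions.

    Assumes both lists are sorted by position and that junction coordinates
    match reference exon boundaries exactly (a simplification).
    """
    # Index reference exons by their donor (end) and acceptor (start) boundaries.
    donor_index = {end: i for i, (_, end) in enumerate(reference)}
    acceptor_index = {start: i for i, (start, _) in enumerate(reference)}

    skipped: List[int] = []
    # Each pair of consecutive aligned blocks defines one splice junction.
    for (_, donor), (acceptor, _) in zip(transcript, transcript[1:]):
        i = donor_index.get(donor)
        j = acceptor_index.get(acceptor)
        if i is None or j is None:
            continue  # junction uses a boundary not in the reference; ignore here
        if j > i + 1:
            # The junction joins exon i directly to exon j, bypassing those between.
            skipped.extend(range(i + 1, j))
    return skipped


# Hypothetical four-exon gene and an EST that joins exon 0 directly to exon 2.
reference = [(100, 200), (300, 400), (500, 600), (700, 800)]
est = [(150, 200), (500, 600), (700, 750)]
print(skipped_exons(reference, est))  # [1]
```

Exact boundary matching keeps the sketch short. A practical screen would need to tolerate alignment error near splice sites and would still benefit from the manual assessment the abstract describes.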