Beyond the Identification of Transcribed
Sequences:
Functional and Expression Analysis
11th Annual Workshop
November 9-12, 2001
Washington D.C.
Bernhard Korn
RZPD - Ressourcenzentrum für Genomforschung
INF 506
69120 Heidelberg
Germany
telephone: +49 (6221) 42 4700
fax: +49 (6221) 42 4704
email: korn@rzpd.de
prestype: Platform
presenter: Zink
D. Zink1,2, S. Haas3, E. Coward3, M. Vingron3, B. Korn1
1 RZPD Ressourcenzentrum für Genomforschung, INF 506, 69120 Heidelberg, Germany
2 Deutsches Krebsforschungszentrum, INF 506, 69120 Heidelberg, Germany
3 Max-Planck Institut für Molekulare Genetik, Ihnestr. 73, 14195 Berlin, Germany
Public EST databases currently contain more than 3 million human EST sequences, probably representing 40,000-50,000 human genes/transcripts. These data are highly redundant. We take advantage of this redundancy by analysing the differences between sequences belonging to the same gene. The EST sequences are clustered and assembled into a consensus sequence. However, many clusters cannot be assembled into a single consensus sequence; the EST sequences then fall into multiple consensus sequences (contigs) within one cluster. The differences might be due to imperfect sequence data (e.g. partially unspliced sequence templates, sequencing errors) or to alternative splicing. Instead of one gene coding for one mRNA leading to one protein, alternative splicing of transcripts may lead to different mRNA species and therefore to potentially different proteins. Splice variants are often due to alternative exon usage, which we verify by RT-PCR. We have set up a medium-throughput strategy that allows us to screen the expression of genes in 25 different human tissues at multiple stages. We initiated this project by analysing genes that predominantly reside within the Down syndrome critical region and on chromosome 22. Our results indicate that the theoretical data represented in EST databases can be verified in many cases by our experimental design. Moreover, we find additional splice products that are not defined by any EST sequence. To gain more insight, we re-sequence the PCR products in question to confirm their origin and nature. Nevertheless, in more than 18% of the cases, we cannot experimentally support the EST data by RT-PCR. In the future we want to extend splice variant analysis to other chromosomes and gene families. We intend to automate RT-PCR and ultimately design a chip to discriminate a large number of different splice forms of medically relevant genes.
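
The screening idea described in the abstract, namely flagging clusters whose ESTs fall into more than one contig, can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the input layout (one `(est_id, cluster_id, contig_id)` tuple per EST), the function name `multi_contig_clusters`, and all identifiers in the toy data are hypothetical and chosen only for illustration. A flagged cluster is merely a candidate; as noted above, multiple contigs can also arise from unspliced templates or sequencing errors, which is why experimental verification by RT-PCR is needed.

```python
from collections import defaultdict


def multi_contig_clusters(assignments):
    """Return {cluster_id: sorted contig ids} for clusters with more than one contig.

    `assignments` is an iterable of (est_id, cluster_id, contig_id) tuples.
    This layout is an assumption made for this sketch, not a format from the abstract.
    """
    contigs_by_cluster = defaultdict(set)
    for _est_id, cluster_id, contig_id in assignments:
        contigs_by_cluster[cluster_id].add(contig_id)
    # Keep only clusters that did not assemble into a single consensus sequence.
    return {
        cluster_id: sorted(contigs)
        for cluster_id, contigs in contigs_by_cluster.items()
        if len(contigs) > 1
    }


# Hypothetical toy data: clusterA splits into two contigs, clusterB assembles into one.
assignments = [
    ("est1", "clusterA", "contig1"),
    ("est2", "clusterA", "contig1"),
    ("est3", "clusterA", "contig2"),
    ("est4", "clusterB", "contig1"),
    ("est5", "clusterB", "contig1"),
]

candidates = multi_contig_clusters(assignments)
print(candidates)
```

In this toy input only `clusterA` is reported, because its ESTs are split across two contigs; `clusterB` assembles into a single contig and is not a candidate.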