Publication

Conserved synteny at the protein family level reveals genes underlying Shewanella species' cold tolerance and predicts their ...

Publication Type
Journal
Journal Name
PLoS Computational Biology
Publication Date
Page Numbers
1 to 10
Volume
10
Issue
1

Despite rapid growth in the number of sequenced bacteria and significant progress in the annotation of their genomes, current computational technologies are limited in their capability to associate the genotype of a sequenced bacterial organism with its phenotypic traits. We evaluated two novel, complementary approaches that can facilitate this task. Both are based on the correlation, across different bacterial species, between the numbers of trait-specific protein families or Pfam domains and a quantitative characteristic of the phenotypic trait. The first, a top-down approach, quantifies and compares a higher-level characteristic, a bacterial phenotype, to reveal genomic characteristics and specific genes related to that phenotype. The second, a bottom-up approach, predicts phenotypes by quantifying molecular functions in the genomes of closely related bacterial species, followed by pairwise correlation of the molecular function enrichments and their network clustering; it is implemented using network analysis tools. The approaches were validated by a comparison of 19 sequenced Shewanella species. Using the first approach, we identified specific domains and gene clusters associated with cold tolerance of these mesophilic species and predicted some novel cellular mechanisms underlying the phenotype. We find that in three tested species both cold and salt tolerance relate to the presence in their genomes of a specific Na+/H+ antiporter. Using the second approach, we identified genomic clusters predicting several environmentally relevant phenotypes in the newly sequenced Shewanella species, including degradation of aromatic compounds by an aerobic hybrid pathway, utilization of ethanolamine, and arsenic and copper resistance.
The results of the study confirm the validity of the approaches and their utility for (i) computational prediction of phenotypic traits in sequenced organisms, (ii) revealing genomic determinants of known complex phenotypes, (iii) ortholog prediction, and (iv) discovery of the function of unknown domains and hypothetical proteins.
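
The correlation idea behind both approaches can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the correlation measure or the clustering method, so Pearson correlation and connected components of a thresholded correlation graph are assumed here as simple stand-ins. The function names, the domain labels, the trait values, and the 0.9 threshold are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence has zero variance (assumed convention).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_domains_by_trait(domain_counts, trait):
    """Top-down sketch: rank domains by |r| between per-species domain
    counts and a quantitative trait measured in the same species order."""
    scores = {d: pearson(counts, trait) for d, counts in domain_counts.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


def cluster_correlated_domains(domain_counts, threshold=0.9):
    """Bottom-up sketch: link domains whose counts correlate at or above
    `threshold` (hypothetical cutoff), then return connected components."""
    names = sorted(domain_counts)
    adjacency = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(domain_counts[a], domain_counts[b]) >= threshold:
                adjacency[a].add(b)
                adjacency[b].add(a)

    seen = set()
    clusters = []
    for name in names:
        if name in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack = [name]
        seen.add(name)
        component = []
        while stack:
            current = stack.pop()
            component.append(current)
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    # Hypothetical toy data: counts of four made-up domains in four species,
    # and a made-up quantitative trait value per species.
    toy_counts = {
        "domain_A": [1, 2, 3, 4],
        "domain_B": [2, 4, 6, 8],
        "domain_C": [4, 3, 2, 1],
        "domain_D": [1, 1, 1, 1],
    }
    toy_trait = [0.1, 0.2, 0.3, 0.4]
    print(rank_domains_by_trait(toy_counts, toy_trait))
    print(cluster_correlated_domains(toy_counts))
```

In this sketch, a domain whose copy number rises or falls with the trait across species ranks highest, and domains that co-vary across genomes fall into the same cluster. A real analysis would need many more species and some control for phylogenetic relatedness, which the sketch ignores.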