DOE Genome Contractor-Grantee Workshop IX
January 27-31, 2002 Oakland, CA
Microbial Genome Program Abstracts
132. Optical Map Based Sequence Validation of Microbes
Marco Antoniotti1, Thomas Anantharaman2, Violet
Chang1, David Schwartz3, and Bud Mishra1
The research activity of NYU bioinformatics groups centers on algorithms for mapping and sequencing projects, with a focus on specification languages, environments, and systems for bioinformatics. Over the last eight months, we have used the bioinformatics system to develop a suite of mathematical models and associated algorithms for Validation, Alignment, and Restriction Fragment Translocation Detection and Correction, using publicly available optical mapping data and related sequence data. These problems are ideal for testing the suitability of our software system, as it addresses new challenges posed by the availability of vast amounts of biological data, especially in the form of DNA sequences. In particular, the Ordered Restriction Mapping manipulation tools we have developed serve the dual purpose of improving DNA sequencing efforts and providing new analysis capabilities that can be derived directly from the maps themselves.
The Validation algorithm we have developed for Ordered Restriction Maps serves to establish the quality of an assembled DNA sequence by comparing it to an Ordered Restriction Map (e.g. a map obtained via the Optical Mapping Process). The core procedure of the Validation algorithm takes a DNA sequence (retrievable from a variety of sources) and an Ordered Restriction Map (called the “consensus” map). From the DNA sequence, an “in silico” ordered restriction map is obtained (called the “sequence” map). The sequence map is aligned against the original map using a sophisticated dynamic programming formulation. The procedure constructs several alignments of the sequence map against the original map. Each such alignment is ranked according to the value of the computation of a Maximum Likelihood Estimate of a statistical model that takes into account several error sources of the underlying biochemical process. For the Optical Mapping process the error sources considered are
Orientation is also taken into account, since the consensus map can be in either 3' or 5' sense.
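As an illustration only, the step of deriving an “in silico” sequence map and considering both orientations can be sketched as follows. This is a minimal sketch, not the Validation algorithm itself: the function names, the `cut_offset` parameter, and the choice of recognition site are assumptions, and no error model or alignment is included.

```python
def in_silico_map(sequence, site, cut_offset=0):
    """Return ordered fragment lengths obtained by cutting `sequence`
    at every occurrence of the recognition `site` (hypothetical helper).

    `cut_offset` is the position of the cut inside the site,
    e.g. 1 for an enzyme cutting G^AATTC.
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    boundaries = [0] + cuts + [len(seq)]
    # Consecutive boundary differences are the ordered fragment lengths.
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b - a > 0]


def reverse_orientation(fragments):
    """Fragment lengths as read from the opposite end of the molecule."""
    return fragments[::-1]


toy_sequence = "AAGAATTCAAAAGAATTCAA"  # invented toy sequence
sequence_map = in_silico_map(toy_sequence, "GAATTC", cut_offset=1)
print(sequence_map)                       # -> [3, 10, 7]
print(reverse_orientation(sequence_map))  # -> [7, 10, 3]
```

A validation procedure would then compare both orderings of the sequence map against the consensus map and keep the better-scoring one.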
The (Multiple) Alignment algorithm takes as input an Ordered Restriction (Consensus) Map and a set of (small) sequence contigs. Each sequence contig is run through the Validation subsystem and its possible alignments are computed and set aside for future post-processing. Subsequently the Alignment algorithm constructs a putative anchoring of every contig on the consensus map, by selecting one alignment per contig. The selection procedure represents a trade-off among the following criteria
Objectives 2 and 3 are contradictory; hence we developed a Lagrangian-like approximation scheme that weighs one criterion or the other under user control. The first criterion may be relaxed in order to look for overlaps that were possibly missed by the contig sequencing and assembly procedure. As our preliminary analysis led us to believe that the problem of construction is likely to be computationally infeasible, we developed two procedures that approximate its construction. The first is a simple Greedy approach that has worked well in our experiments, and the second is an iterative 1D Dynamic Programming procedure that minimizes a weighted cost function subject to constraints 1 to 3 above.
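A greedy selection of one alignment per contig might look like the sketch below. The selection criteria are not listed in this abstract, so the ones used here (no overlap between placed contigs, highest alignment score first) are assumptions, as are the `Alignment` record and the `greedy_anchor` name.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record: one candidate placement of a contig."""
    contig: str
    start: int    # start coordinate on the consensus map
    end: int      # end coordinate on the consensus map (exclusive)
    score: float  # higher is better (assumed)


def greedy_anchor(candidates):
    """Pick at most one alignment per contig, best score first,
    skipping any candidate that overlaps an already placed contig."""
    placed = []
    used = set()
    for aln in sorted(candidates, key=lambda a: a.score, reverse=True):
        if aln.contig in used:
            continue
        if any(aln.start < p.end and p.start < aln.end for p in placed):
            continue
        placed.append(aln)
        used.add(aln.contig)
    return sorted(placed, key=lambda a: a.start)


candidates = [  # invented toy data
    Alignment("A", 0, 100, 0.9),
    Alignment("A", 200, 300, 0.5),
    Alignment("B", 50, 150, 0.8),
    Alignment("B", 150, 250, 0.7),
    Alignment("C", 90, 160, 0.6),
]
for aln in greedy_anchor(candidates):
    print(aln.contig, aln.start, aln.end)
```

In this toy run contig C stays unplaced because its only candidate overlaps contig A; relaxing the no-overlap test is where a search for missed overlaps would enter.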
The Restriction Fragment Translocation (RFT) Detection and Correction algorithm reuses the basic Validation algorithm by considering a consensus map, a sequence map and all the sub-sequences of the sequence map. The validation algorithm is run on the N^2 sub-maps of the sequence map, in order to determine whether any of the sub-maps can be anchored in a different position than the one assigned by the sequence map alignment. The result of the RFT algorithm is an ordered set of rearrangements of the sequence sub-maps.
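The quadratic number of sub-maps can be seen by enumerating every contiguous run of fragments. The sketch below assumes that a “sub-map” means a contiguous run of the sequence map, which the abstract does not state explicitly; the function name is hypothetical.

```python
def contiguous_submaps(fragments):
    """Yield (start, end, sub_map) for every contiguous run of fragments.

    A map of N fragments has N * (N + 1) / 2 such runs, i.e. on the
    order of N^2, each of which would be passed to the validation step.
    """
    n = len(fragments)
    for i in range(n):
        for j in range(i + 1, n + 1):
            yield i, j, fragments[i:j]


toy_map = [3, 10, 7, 5]  # invented fragment lengths
submaps = list(contiguous_submaps(toy_map))
print(len(submaps))  # -> 10
```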
The three Ordered Restriction Maps algorithms (Validation, Alignment, and RFT Detection/Correction) do not work in isolation. While we developed the mathematical and statistical models that constitute the core of the three algorithms, we also developed a software infrastructure that integrates the components and is built on a database. To achieve the software integration, we developed the specification of file exchange formats and several auxiliary programs used in a variety of ways (e.g. a sequence and map “simulator” that can be used to generate in silico sequences and maps of various complexity and structure). The three algorithms produce large data sets. In order to navigate the data sets in a more interactive way, we developed two specialized viewers called “CONVex” and “genscape.” “genscape” is an evolution of “CONVex.” The viewers interact with the underlying infrastructure. The main idea behind the two viewers is to provide a zoomable view of the set of alignments, and to enable the user to inspect the displayed maps (consensus and sequence) at a fine level of detail. The viewers are also available as libraries and have been integrated in the VALIS system. All of this software infrastructure has been made available publicly through the Internet and has also been made specifically available to the University of Wisconsin.
The three algorithms have been tested in a variety of ways. In particular we concentrated on analyzing the P. falciparum parasite (the Malaria agent): an organism for which there are both published sequences and published Ordered (Optical) Restriction Maps. We downloaded the known sequences of the P. falciparum parasite’s 14 chromosomes from the PlasmoDB online database (www.plasmodb.org). Only chromosomes 2 and 3 have been fully assembled so far. For the remaining 12 we only have sets of contigs, for which no known position along the respective chromosome is published. We ran the Validation algorithm on the P. falciparum chromosome 2 and 3 sequence data, against the Ordered (Optical) Restriction Maps. The results show very good agreement between the consensus map and sequence map. Hence we can conclude that both consensus and sequence maps are correct. For the remaining 12 P. falciparum chromosomes, we ran the Alignment algorithm and we were able to propose anchoring positions for all the contigs along the respective consensus maps. All these results are viewable at http://bioinformatics.cat.nyu.edu/valis under the “Projects” link.
133. Interaction of Cytochrome c3 with Uranium
Judy D. Wall and Barbara Rapp-Giles
Biochemistry Department, University of Missouri-Columbia
Several years ago, the reduction of soluble U(VI) to U(IV), a much less soluble form, was demonstrated in cell extracts of Desulfovibrio vulgaris Hildenborough with hydrogen gas as the reductant. Further experimentation demonstrated that the reduction was dependent on the presence of cytochrome c3 in the extract and, of course, hydrogenase. To determine whether cytochrome c3 is the actual reductase of bacteria in this genus, we are preparing purified protein for the application of analytical tools by our Los Alamos collaborators led by Dr. William Woodruff. We have constructed a mutant of D. desulfuricans carrying an interrupted cycA gene encoding cytochrome c3. We are now attempting to create a stable mutation by marker exchange mutagenesis. This mutant strain will be necessary to produce mutant forms of the protein for testing the mechanism of the interaction of the protein with uranium, if any.
134. Genome-Wide Functional Analysis of the Metal-Reducing Bacterium Shewanella oneidensis MR-1: Progress Summary
Alexander Beliaev1, Dorothea K. Thompson1,2, Carol S. Giometti3, Kenneth H. Nealson4, Alison E. Murray2, James M. Tiedje2, and Jizhong Zhou1,2
1Environmental Sciences Division, Oak Ridge National Laboratory†,
Oak Ridge, TN
Large-scale sequencing of entire microbial genomes has ushered in a new era in biology, but the greatest challenge will be to define gene function and complex regulatory networks at the whole-genome level. In this Microbial Genome Project, we proposed to conduct a microarray-based functional genomic study to elucidate the genes and regulatory mechanisms involved in energy metabolism in Shewanella oneidensis MR-1. To study the genes and regulatory schemes underlying anaerobic respiration, wild-type and mutant strains of S. oneidensis were examined using DNA microarrays containing 691 open reading frames (ORFs) and 2-D polyacrylamide gel electrophoresis (2-D PAGE). Insertional mutants defective in the Fnr-like etrA (electron transport regulator A) and fur (ferric uptake regulator) genes were generated by suicide plasmid integration and characterized. Disruption of the etrA gene resulted in altered mRNA levels for 69 genes with predicted functions in energy metabolism, transcription regulation, substrate transport, and biosynthesis. In this subset, up to a 12-fold decrease in mRNA abundance was displayed by genes involved in anaerobic respiration (dmsAB, hydABC, fdhAC), while aerobic genes encoding cytochrome oxidases, NADH dehydrogenase, and TCA cycle enzymes were induced up to 3-fold as a result of the etrA mutation. Notably, disruption of etrA affected the transcription of ten regulatory genes, including fur and hutC (histidine utilization?). Our results suggest that EtrA plays a subtle role in MR-1 anaerobic gene regulation and is not essential for growth and reduction of electron acceptors.
Microarray analysis of a fur knockout strain (FUR1) revealed that genes with predicted functions in electron transport, energy metabolism, transcription regulation, and oxidative stress protection were either repressed (ccoNQ, etrA, cytochrome b- and c maturation-encoding genes, qor, yiaY, sodB, rpoH, phoB, chvI) or induced (yggW, pdhC, prpC, aceE, fdhD, ppc) in a fur- background. As expected, disruption of fur also resulted in derepression of genes putatively involved in siderophore biosynthesis and iron uptake. Analysis of a subset of the FUR1 proteome (i.e., primarily soluble cytoplasmic and periplasmic proteins) indicated that 11 major protein species reproducibly showed significant (P < 0.05) differences in abundance relative to the wild type. Protein identification using mass spectrometry revealed that the expression of two of these proteins (SodB and AlcC) correlated with the microarray data. Microarray data and sequence analysis suggest that Fur may act with EtrA and possibly other regulatory proteins to coordinate the synthesis of iron-containing enzymes and cytochromes with iron uptake and respiration. While our findings agree with previous descriptions of Fur as a repressor of iron acquisition genes, this study also suggests that MR-1 Fur plays a role in the coordinate regulation of energy metabolism.
In response to changes in redox and growth conditions, 121 genes out of the 691 arrayed ORFs displayed at least a 2-fold difference in transcript abundance in wild-type S. oneidensis MR-1. Genes induced during anaerobic respiration included those involved in cofactor biosynthesis and assembly (moaACE, ccmHF, cysG), substrate transport (cysUP, cysTWA, dcuB), and anaerobic energy metabolism (dmsAB, psrC, pshA, hyaABC, hydABC). Transcription of genes encoding a periplasmic nitrate reductase (napDAGHB), cytochrome c552, and prismane was elevated 8- to 56-fold in response to the presence of nitrate, while cymA, ifcA, and frdA were specifically induced 3- to 8-fold under fumarate-reducing conditions. In addition, we have conducted experiments with S. oneidensis MR-1 partial microarrays to determine differential gene expression under iron- and manganese-reducing conditions. Complete linkage hierarchical cluster analysis identified clusters with constitutively expressed genes, those that were anaerobically or aerobically induced with both Mn(IV) and Fe(III) serving as terminal electron acceptors, and those that were either induced with iron and not manganese or vice versa. Several electron transport carriers including NADH dehydrogenase, ubiquinone (ubiH), cytochromes b and c1, and a membrane-bound c-type oxidase were induced under all Mn(IV) and Fe(III) experiments. Genes encoding a number of electron transport carriers (dehydrogenases and cytochromes) as well as stress response proteins were induced only with iron as the electron acceptor. Perhaps the most interesting gene, and one with the highest induction under iron reduction, was one encoding N-acylhomoserine lactone synthase, a key regulator of quorum sensing in Gram-negative bacteria. These experiments demonstrate that genes unique to different electron acceptors can be revealed by microarray hybridization. 
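The “at least a 2-fold difference” criterion can be illustrated with a small filter on paired signal intensities. This is a sketch under assumptions: the gene labels and intensity values below are invented, and real microarray analysis would also involve normalization and replicate statistics that are not shown.

```python
import math


def fold_change_hits(expression, threshold=2.0):
    """Return {gene: log2 ratio} for genes whose treated/reference
    ratio differs by at least `threshold`-fold in either direction.

    `expression` maps a gene label to (reference, treated) intensities.
    """
    cutoff = math.log2(threshold)
    hits = {}
    for gene, (reference, treated) in expression.items():
        if reference <= 0 or treated <= 0:
            continue  # a ratio is undefined for non-positive signals
        log2_ratio = math.log2(treated / reference)
        if abs(log2_ratio) >= cutoff:
            hits[gene] = log2_ratio
    return hits


# Invented intensities for three hypothetical ORFs.
toy = {
    "orf_a": (100.0, 800.0),  # 8-fold induced
    "orf_b": (100.0, 150.0),  # 1.5-fold, below the cutoff
    "orf_c": (400.0, 100.0),  # 4-fold repressed
}
print(sorted(fold_change_hits(toy)))  # -> ['orf_a', 'orf_c']
```

Working on the log2 scale treats induction and repression symmetrically, so a 4-fold decrease and a 4-fold increase are equally far from zero.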
Additional experiments using mutagenesis and whole-genome microarrays (expected to be completed by the end of 2001) will be conducted in order to define the components and mechanisms of metal-reducing pathways in MR-1.
Finally, partial microarrays have been used to define genome relationships in the Shewanella genus. DNA:DNA hybridization experiments allowed us to visualize the relationships between organisms in the Shewanella genus by comparing individual ORF hybridizations for partial genome arrays. Results from those experiments have shown that other Shewanella species hybridize to the MR-1 array, and that some suites of electron accepting and regulatory genes (e.g., arcA) are highly conserved within the halotolerant branch of the Shewanella genus. Thus, we believe that gene expression data obtained in the proposed work will be relevant to a broader diversity of organisms that are ubiquitous in the environment.
†Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract number DE-AC05-00OR22725.
135. Microarray Analysis of Sugar Metabolism Gene Networks in Thermotoga maritima
Arvin D. Ejaz1, Amy M. Mikula1, Tu Nguyen2, Ken Noll2, Karen E. Nelson1, and Steven R. Gill1
1The Institute for Genomic Research, Rockville, MD 20850
The thermophilic bacterium, Thermotoga maritima, is a heterotrophic organism capable of metabolizing complex carbohydrates such as cellulose and xylan. Since cellulose and xylan are major components of plant biomass, their conversion into fuels and chemicals has a significant economic potential. Once the regulation and dynamics of T. maritima’s genes are understood, it may become an important organism in high temperature industrial processes that convert plant biomass into useful energy. Previous sequencing of the T. maritima MSB8 genome identified genes involved in cellulolytic and xylanolytic pathways. In an effort to elucidate T. maritima genes and regulatory networks involved in metabolism of simple and complex carbohydrates, we have constructed a whole genome T. maritima microarray. We are currently using these arrays to investigate gene regulation of T. maritima grown in continuous cultures with media containing either glucose, lactose or maltose as carbon sources. We will present data on T. maritima microarray construction and the results of these experiments.
136. Gene Expression Profiles in Nitrosomonas europaea, An Obligate Chemolithoautotroph
Daniel Arp1, Martin Klotz2, and Jizhong Zhou3
1Oregon State University
Ammonia-oxidizing bacteria are participants in both the C and N cycles. These bacteria are obligate autotrophs (they obtain all of their carbon for growth from CO2) and obligate chemolithotrophs (they derive all the reducing power and energy necessary for biosynthesis from the transformation of NH3 to NO2-). Ammonia-oxidizing bacteria can have profound effects on the environment. Upon oxidation, ammonia applied to croplands is mobilized and can leach into ground and surface waters. Ammonia-oxidizing bacteria also produce the greenhouse gases NO and N2O. Given the broad substrate specificity of the oxygenase that initiates the oxidation of ammonia, ammonia-oxidizing bacteria also have the potential to initiate the degradation of several environmental pollutants (e.g. trichloroethylene). Basic and applied research is essential to understand how ammonia-oxidizing bacteria respond to changes in their environment. Nitrosomonas europaea is the best characterized of these bacteria. To date, molecular investigations of N. europaea have focused primarily on single genes and enzymes involved in the oxidation of ammonia. These studies have revealed surprisingly strong responses to environmental changes, especially given the obligate dependence on ammonia and CO2. With the sequencing of the genome of this bacterium through the DOE Microbial Genome Program (http://bahama.jgi-psf.org/prod/bin/microbes/neur/home.neur.cgi), and the development of methods for genetic manipulation, it is now possible to investigate the expression of the entire complement of N. europaea genes by applying microarray-based genomic technology. The results will provide insights into how this bacterium responds to changes, and should also provide insights into how autotrophs and lithotrophs in general modulate their gene expression in response to nutrient changes and environmental stresses.
The specific objectives of the research are to:
137. Improving Functional Analysis of Genes Relevant to Environmental Restoration via an Analysis of the Genome of Geobacter sulfurreducens
Derek R. Lovley, Madellina Coppi, Stacy Cuifo, Susan Childers, Ching Lean, Franz Kaufmann, Daniel Bond, Teena Mehta, and Mary Rothermich
Department of Microbiology, University of Massachusetts, Amherst, MA 01003
Better information is required in order to predict the function of genes, in pure cultures of microorganisms or microbial communities, that are involved in important environmental processes such as the remediation of toxic wastes. Geobacter species have novel physiological characteristics that make them ideally suited for the bioremediation of radioactive metals and organic contaminants in subsurface environments. Furthermore, molecular studies have demonstrated that Geobacter species are dominant members of microbial communities in geographically and geochemically diverse subsurface environments in which microorganisms are actively bioremediating metal or organic contamination. This represents a rare instance in which an organism that is known to be numerically significant and active in an environmental process of interest is also available in pure culture.
Analysis of the complete genome sequence of Geobacter sulfurreducens has revealed that this organism has a high percentage of genes for putative electron-transport proteins and that the expression of these genes is likely to be highly regulated. In order to begin elucidating the function of genes involved in electron transport to metal electron acceptors, G. sulfurreducens was grown under steady-state conditions in chemostats with different electron acceptors. Molecular and biochemical analyses demonstrated that genes for a number of c-type cytochromes were specifically expressed only when G. sulfurreducens was grown with Fe(III) as the electron acceptor. They were not expressed when fumarate served as the electron acceptor. Physiological studies of knock-out mutants that no longer expressed certain c-type cytochromes suggested that some of the c-type cytochromes are intermediary electron transport proteins, but that at least one of the c-type cytochromes might function as a terminal metal reductase. Comparison of gene expression between cells grown on soluble Fe(III)-citrate and insoluble Fe(III) oxide demonstrated that genes for the production of pili are specifically expressed during growth on Fe(III) oxide. Differential production of pili was confirmed with electron microscopy. A knock-out mutation that eliminated expression of pilA, the gene for the structural pilin protein, had no effect on the ability of the cells to grow with Fe(III) citrate, but abolished their ability to grow with Fe(III) oxide as the electron acceptor. Complementation with a functional pilA gene restored the capacity for growth on Fe(III) oxide. This is the first description of a protein specifically required for a dissimilatory metal-reducing microorganism to grow on Fe(III) oxide, the primary electron acceptor for metal-reducing microorganisms in subsurface environments.
These results demonstrate that the strategy of examining differential gene expression followed by a genetic analysis of gene function is rapidly elucidating important aspects of Geobacter physiology. This, in turn, is providing important insights into the mechanisms by which Geobacter functions in the bioremediation of metal and organic contamination in the subsurface.
138. Genome Sequencing of Gemmata obscuriglobus
Naomi Ward1, Margaret K. Butler2, Rebecca L. Smith2, and John A. Fuerst2
1The Institute for Genomic Research, Rockville, MD 20850
Gemmata obscuriglobus is a member of the planctomycete group of Bacteria. These organisms possess a unique combination of morphological and ultrastructural properties, including budding replication, the presence of crateriform structures of unknown function on the cell surface, a diverse range of extracellular appendages, and lack of the “universal” cell wall polymer peptidoglycan. In recent years, the planctomycetes have been found to be widely distributed and often numerically abundant in both aquatic and terrestrial environments. This group also includes the “missing lithotrophs” performing the “anammox” process - organisms that can break down ammonia in wastewater anaerobically; this ecological niche has been postulated for many years, but the organisms performing this role have only recently been identified as planctomycetes. Other proposed environmental roles include degradation of chitin in marine systems, and the breakdown of toxic algal blooms. Lastly, they are a phylogenetically distinct lineage within the Bacteria, and there are currently no published genome sequences from members of this group. G. obscuriglobus was the first planctomycete, and indeed first bacterium, shown to possess a membrane-bounded DNA-containing nuclear region, i.e., a structure analogous to the eukaryotic nucleus. This provides an ultrastructural exception to the prokaryote/eukaryote dichotomy, and has interesting implications for transport within the cell, and the linking of transcription and translation processes. These unique features among the Bacteria may have wide implications for discovery of new mechanisms in molecular cell biology correlated with cell compartmentalization. Other planctomycetes were subsequently shown to exhibit various types of cellular compartmentalization, suggesting that this may be a widespread property of members of this group. The genome size of G. obscuriglobus is at the upper limits of known bacterial genomes; PFGE-based analyses suggest a genome size of approximately 9 Mb. Comparable genome sizes are seen in developmentally complex bacteria such as Myxococcus xanthus and Streptomyces spp. Availability of genome sequence data from G. obscuriglobus will allow comparative analysis of another large genome, and insight into the evolutionary mechanisms which have led to these genomic expansions. At the time of writing, the genome sequencing project was in the library construction phase. A summary of the current status of the project will be presented.
139. Genome Sequence of Methanococcus maripaludis, a Genetically Tractable Methanogen
Erik L. Hendrickson1, Maynard Olson2, Gary Olsen3, and John A. Leigh1
1Department of Microbiology, University of Washington
We have sequenced the genome of Methanococcus maripaludis strain LL, a mesophilic methanogenic archaeon, to six-fold coverage. Assembly of the partial sequence has yielded 163 contigs, ranging in length from 0.19 to 106 kb and covering a total of 1.71 Mb. The total is comparable to that of the most closely related organism with a complete genome sequence, Methanococcus jannaschii, with a total genome length of 1.66 Mb. GC content is 33%, again comparable to M. jannaschii (31%). Potential open reading frames (ORFs) have been identified by the CRITICA program, which predicts 1742 ORFs, similar to the 1738 of M. jannaschii. Of the predicted ORFs, 1522 yield BLAST homologies, the majority of which show their highest homologies to M. jannaschii (67%). The rest have their highest homologies to other methanogens (13%), other Archaea (8%), and Bacteria (10%), with only a few having their best match in Eukarya (1%) and four matching viruses. As M. maripaludis derives its energy and carbon from formate or H2 and CO2, we examined the sequence for the corresponding metabolic genes. The sequence contains genes for a complete methanogenesis pathway as well as formate dehydrogenase, with similar organization to that occurring in M. jannaschii. Preparations are under way to finish the genome, and for post-genomic studies.
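Summary figures of the kind quoted above (contig count, size range, total length, GC content) can be computed from a set of contig sequences as in the following sketch. The function names and the toy contigs are assumptions for illustration; this is not the pipeline used for the M. maripaludis assembly.

```python
def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def contig_summary(contigs):
    """Basic assembly statistics for a non-empty list of contig strings."""
    lengths = [len(c) for c in contigs]
    return {
        "count": len(contigs),
        "min": min(lengths),
        "max": max(lengths),
        "total": sum(lengths),
        "gc": gc_content("".join(contigs)),
    }


toy_contigs = ["ATGC", "AATT", "GGCCAT"]  # invented toy contigs
summary = contig_summary(toy_contigs)
print(summary["count"], summary["min"], summary["max"], summary["total"])
# -> 3 4 6 14
print(round(summary["gc"], 3))  # -> 0.429
```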
Collaborators include M. Hackett, R. Bumgarner, and R. Samudrala (University of Washington), W. Whitman and J. Amster (University of Georgia), and D. Söll (Yale University).
140. The Genome of Ferroplasma acidarmanus: Clues to Life in Acid
Larry Croft1, Amanda Barry2, Paul Predki3, Stephanie Stilwagen3, Genevieve Johnson3, Thomas M. Gihring1, Brett J. Baker1, Jennifer Macalady1, George F. Mayhew4, Valerie Burland4, Teresa Janecki3, Charles W. Kaspar5, Brian Fox2, and Jillian F. Banfield1
1Department of Geology and Geophysics, University of Wisconsin-Madison
The archaeon Ferroplasma acidarmanus populates hot, acidic (pH 0-3), metal-rich solutions associated with acid mine drainage sites. The multiple challenges posed by its environment make it an excellent model organism for study of genes involved in metal and acid tolerance. Because it is an obligate acidophile, it is also ideal for investigation of lateral gene transfer as it is effectively separated from most of the biosphere by its environment. The 2 Mb genome was sequenced and is 15% homologous (BLAST expectation < 1e-45) by peptide similarity to T. acidophilum, an acidophilic scavenger, the closest sequenced relative to F. acidarmanus. Incomplete amino acid biosynthetic pathways and an array of intra- and extracellular proteases and amino acid pumps support heterotrophic growth and indicate a complex dependence on other community members for essential organic compounds.
The F. acidarmanus proteome is also 4% homologous to Sulfolobus solfataricus, another archaeal acidophile. This is much more than would be expected by evolutionary relatedness alone. It is highly likely that lateral gene transfer has occurred between Sulfolobus spp. and F. acidarmanus. Bacteriophage sequences and transposons in the genome suggest vehicles for lateral gene transfer.
Using TMHMM, 23% of proteins in the F. acidarmanus proteome were identified as membrane bound and 12% of proteins were permeases or created permease-like structures. Over 25% of all proteins have not been previously identified (singletons) and a large proportion (39%) of these are membrane bound, suggesting much unknown membrane-associated extracellular activity. A further 23% of proteins have similarity to proteins from other organisms but have unknown function; most of these are also membrane bound (18% of proteome). There appears to be no discernible amino acid bias between cytoplasmic peptides and regions of peptides exposed to the extracellular environment. This suggests that protein secondary or tertiary structure or modification plays a significant role in acid stability of extracellular proteins.
Acid mine drainage contains high concentrations of toxic metal species such as arsenic, which are removed from the cytoplasm by a set of metal efflux pumps, and detoxified by metal reducing enzymes (such as mercury reductases). Further protection is afforded by a predominantly tetraether-linked lipid membrane monolayer that makes feasible the large proton gradient between the pH ~ 5.2 cytoplasm and the pH < 1.0 environment.
Several F. acidarmanus genes have been expressed in E. coli, including two Rieske-type iron-sulfur proteins, a blue copper protein, two cytochrome p450-like proteins, a cobalamin biosynthesis protein, and an iron superoxide dismutase. Proteomic studies of gene expression when F. acidarmanus is grown heterotrophically and on ferrous iron are currently underway. Protein expression studies of a candidate tetraether lipid synthesis pathway are in progress.
141. Genome Sequence of the Metal-Reducing Bacterium, Shewanella oneidensis
John F. Heidelberg1, Ian T. Paulsen1, Karen E. Nelson1, William C. Nelson1, Jonathan A. Eisen1, Barbara Methe1, Eric J. Gaidos3, Owen White1, Kenneth H. Nealson2, and Claire M. Fraser1
1The Institute for Genomic Research, 9712 Medical Center Drive,
Rockville, MD 20850 USA
The Shewanella oneidensis genomic sequence will expedite efforts to use this organism for bioremediation of dissolved toxic metals and organic toxins from water supplies. This bacterium, like other metal-reducing bacteria, uses metals (rather than oxygen) as the terminal electron acceptors for anaerobic respiration. This respiratory capability makes it a valuable tool in the removal of toxic metals such as uranium and chromium. Here we report the complete genome of S. oneidensis MR-1. Shewanella oneidensis is notable for its inability to grow on a wide variety of carbon sources; rather, the organism seems to be tuned to the use of fermentation end-products and of the pyruvate formate lyase reaction under anaerobic conditions. Shewanella oneidensis is unusual for a gamma proteobacterium in containing a very high number of multi-heme cytochrome c genes and in the diversity of electron transport capacities it possesses. Insertion sequences compose 5.5% of the genome sequence and likely play a critical role in shaping the genome.
142. The Colwellia Strain 34H Genome Sequencing Project
Barbara Methe1, Matthew Lewis1, Bruce Weaver1, Jan Weidman1, William Nelson1, Adrienne Huston2, Jody Deming2, and Claire Fraser1
1The Institute for Genomic Research, Rockville, MD 20850
Approximately 7% of the Earth’s surface is covered by sea ice and by volume about 90% of the world’s oceans exist at a temperature of 5°C or less. As a result, large regions of the marine ecosystem are permanently cold and colonized principally by cold-adapted microorganisms. The sequencing of the entire genome of Colwellia strain 34H will provide the first complete assembly of a psychrophilic (growth optima < 15°C and maximum growth < 20°C) bacterium. As a member of the gamma sub-class of the proteobacteria, the genus Colwellia represents a group of obligate marine bacteria, many of which are psychrophiles, that play important roles in carbon and nutrient cycling in polar marine environments. Of particular interest is the capability of Colwellia to produce cold-adapted enzymes that have potentially important applications for use in the fields of biotechnology and bioremediation. Recent biochemical investigations of strain 34H have demonstrated its ability to release cold-adapted extracellular proteases with the lowest activity optima yet reported for a cell-free extract from a pure culture.
With primary funding from the Department of Energy, the Colwellia strain 34H genome project has commenced using a random shotgun approach which has already provided approximately eight-fold coverage of a 5.3 Mb genome. Closure of the remaining physical and sequencing gaps and resolving and ordering of RNA operons is now in progress using a variety of directed sequencing strategies including: sequencing from primers designed to point into gaps, multiplex PCR, micro-library construction and transposon mutagenesis of appropriate library clones. A suite of software programs developed by The Institute for Genomic Research is also being employed to assemble the genome, aid in gap closing, assembly verification and annotation. Examination of this genome will improve our understanding of the adaptations of this organism to cold marine environments, which in turn has important implications in areas as diverse as microbial ecology, evolution, biotechnology and bioremediation.
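Why gaps remain even at eight-fold coverage can be illustrated with the idealized Lander-Waterman model, in which reads land uniformly at random and the expected uncovered fraction of the genome is exp(-c) for coverage c. Applying that model here is an assumption made for illustration; it ignores cloning bias and repeats, so actual gap sizes for strain 34H may differ considerably.

```python
import math


def expected_uncovered_bp(genome_bp, coverage):
    """Expected number of unsequenced bases under the idealized
    Lander-Waterman model (uniformly random read placement)."""
    return genome_bp * math.exp(-coverage)


# Figures quoted in the text: about 8x coverage of a 5.3 Mb genome.
uncovered = expected_uncovered_bp(5_300_000, 8)
print(round(uncovered))  # -> 1778
```

Under these assumptions only about 1.8 kb would be expected to remain unsequenced, which suggests that remaining gaps in practice reflect effects outside the random model and therefore call for directed strategies.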
143. Complete Genome Sequence of Acidithiobacillus ferrooxidans Strain ATCC23270
Hervé Tettelin1, Keita Geer1, Jessica Vamathevan1, Florenta Riggs1, Joel Malek1, Maureen Levins1, Mobolanle Ayodeji1, Sofiya Shatsman1, Getahun Tsegaye1, Stephanie McGann1, Robert J. Dodson1, Robert Blake2, and Claire Fraser1
1The Institute for Genomic Research, 9712 Medical Center Drive,
Rockville, MD 20850
Acidithiobacillus ferrooxidans is a Gram-negative bacterium of industrial and environmental relevance. It is a major component of the consortia of microorganisms used in biomining and a contributor to acid mine runoff, which results in pollution near metal and coal mines and other related environments. A. ferrooxidans is an acidophile (optimal pH between 1.5 and 2.0), a mesophile (growth under 45°C), and a chemolithoautotroph. It gains energy from oxidative phosphorylation, obtains nitrogen from N2 in the air and carbon exclusively from CO2 fixation. It derives energy from oxidation of reduced inorganic sulfur to H2SO4 and oxidation of Fe2+ to Fe3+, which precipitates as insoluble Fe(OH)3. The 2.9 Mb chromosome of A. ferrooxidans ATCC23270 was sequenced to 8x coverage with 50,136 shotgun sequences derived from small (ca. 2 kb, 5x) and large (ca. 10 kb, 3x) insert shotgun libraries, and is currently undergoing the final stages of gap closure. The genomic sequence displays a G+C content of 58.4% and contains 61 repeats larger than 500 bp. It was annotated in a fully automated fashion, which will be followed by manual curation (expected to occur after this workshop) of each open reading frame. However, the available annotation will be sufficient to derive meaningful information about the metabolic pathways that are critical to the life of this organism in its inorganic environment. In addition, we will focus on the set of surface (including transporters) and secreted proteins that allow A. ferrooxidans to feed on ores. Results from analyses of the structure of the chromosome, including regions of atypical nucleotide composition, putative islands of horizontal transfer, recent gene duplications, etc., will also be discussed.
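The coverage figures quoted above can be checked with a short calculation. The sketch below is illustrative only: it derives the mean read length implied by the stated genome size, coverage and read count, and applies the idealized Lander-Waterman model (uniformly random reads, an assumption not made in the abstract) to estimate the fraction of bases left unsampled. The function names are hypothetical.

```python
import math

def mean_read_length(genome_bp, coverage, n_reads):
    """Mean read length implied by coverage = n_reads * read_length / genome_bp."""
    return coverage * genome_bp / n_reads

def expected_uncovered_fraction(coverage):
    """Idealized Lander-Waterman estimate: P(base not covered) = exp(-coverage).
    Assumes uniformly random read placement, which real libraries only approximate."""
    return math.exp(-coverage)

# Figures taken from the abstract above.
genome_bp = 2_900_000          # 2.9 Mb chromosome
n_reads = 50_136               # shotgun sequences
coverage = 5.0 + 3.0           # small-insert (5x) plus large-insert (3x) libraries

read_len = mean_read_length(genome_bp, coverage, n_reads)
uncovered = expected_uncovered_fraction(coverage)

print(f"implied mean read length: {read_len:.0f} bp")
print(f"expected uncovered fraction: {uncovered:.6f}")
print(f"expected uncovered bases: {uncovered * genome_bp:.0f} bp")
```

Under this idealized model fewer than 0.1% of bases would be missed at 8x; gaps remaining in practice tend to reflect repeats and cloning bias rather than sampling alone, which is consistent with a separate gap-closure phase.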
144. The Complete Genome Sequence of the Green Sulfur Bacterium Chlorobium tepidum
Jonathan A. Eisen1, Karen E. Nelson1, Ian T. Paulsen1, John F. Heidelberg1, Martin Wu1, Robert J. Dodson1, Robert Deboy1, Michelle L. Gwinn1, William C. Nelson1, Daniel H. Haft1, Erin K. Hickey1, Jeremy D. Peterson1, A. Scott Durkin1, James L. Kolonay1, Fan Yang1, Ingeborg Holt1, Lowell A. Umayam1, Tanya Mason1, Michael Brenner1, Terrance P. Shea1, Debbie Parksey1, Tamara V. Feldblyum1, Cheryl L. Hansen1, M. Brook Craven1, Diana Radune1, Jessica Vamathevan1, Hoda Khouri1, Owen White1, J. Craig Venter5, Tanja M. Gruber4, Karen A. Ketchum5, Hervé Tettelin1, Donald A. Bryant3, and Claire M. Fraser1,2
1The Institute for Genomic Research, 9712 Medical Center Drive,
Rockville, MD 20850 USA
The complete genome of the green-sulfur eubacterium Chlorobium tepidum TLS was determined to be a single circular chromosome of 2,154,946 base pairs. This represents the first genome sequence from the phylum Chlorobia, whose members perform anoxygenic photosynthesis by the reductive TCA cycle. Genome comparisons have identified genes in C. tepidum that are highly conserved among photosynthetic species. Many of these have no assigned function and may play novel roles in photosynthesis or photobiology. Numerous duplications of genes involved in biosynthetic pathways for photosynthesis and the metabolism of sulfur and nitrogen were identified. Thirty-eight percent of predicted proteins with likely roles in central intermediary metabolism are most similar to proteins from Archaeal species. Evidence suggests many of these were acquired by lateral gene transfer.
145. The Genome Sequences of Bacillus anthracis Strain Ames
T. D. Read, E. Holtzapple, and S. Peterson
The Institute for Genomic Research
Whole-genome sequencing of a Bacillus anthracis Ames isolate (pXO1- pXO2-) is nearing completion. The initial phase of the project, random sequencing of small- and large-insert libraries, has been completed, and efforts are currently being directed to closing gaps between assemblies. Many portions of the B. anthracis sequence appear to have similar gene content and organization to the archetypal non-pathogenic B. subtilis and to the recently sequenced B. halodurans genome. At least 60% of B. anthracis ORFs have homologues among known B. subtilis genes. These include many spore-coat and spore-germination determinants believed to play an important role in virulence. There are many genes without homologues in B. subtilis that could be important in anthrax infection, including several hemolysin and phospholipase genes. Also notable was the presence in the genome of numerous copies of a conserved 16 bp palindrome known to be a target of PlcR, the B. thuringiensis positive regulator of extracellular virulence determinants. However, the B. anthracis plcR gene contains a potential loss-of-function deletion. The pXO plasmids that contain the key virulence genes encoding toxin and capsule have recently been sequenced. Although the plasmids appear to have undergone frequent rearrangements, there are few apparent instances of gene transfer between plasmid and chromosome, suggesting possible recent arrival of the episome into B. anthracis. We have also recently been funded to sequence the Ames strain isolated from the Florida bioterror attack and will present strain comparisons with the ‘laboratory’ Ames strain.
146. The Complete Genome Sequence of Pseudomonas putida KT2440
Karen E. Nelson1, Burkhard Tuemmler2, and Claire M. Fraser1
1The Institute for Genomic Research, Rockville, MD 20850
Pseudomonas putida is a commonly found soil bacterium that is also a biocontrol agent for plant pathogens, and has a broad capacity for bioremediation and biotransformation. In an attempt to characterize the species, the type strain KT2440 was sequenced by the random shotgun procedure. The 6.18 Mb genome contains 5427 open reading frames, 11% of which are unique to the bacterium. The genome sequence reveals numerous transport and metabolic systems that relate to the organism’s versatility. As expected, there is a high level of gene conservation with the pathogenic species Pseudomonas aeruginosa, a major cause of opportunistic human infections, including those in cystic fibrosis patients. The genomic differences that contribute to variations in the abilities of the two species have been highlighted by whole genome comparisons, and will be presented.
147. Genome Sequence of Methylococcus capsulatus
Naomi Ward1, Jonathan Eisen1, Claire Fraser1, George Dimitrov1, Scott Durkin1, Lingxia Jiang1, Hoda Khouri1, Katherine Lee1, David Scanlan1, Nils Kåre Birkeland2, Live Bruseth2, Ingvar Eidhammer2, Svenn H. Grindhaug2, Ingeborg Holt2, Harald B. Jensen2, Inge Jonasen2, Øivind Larsen2, and Johan Lillehaug2
1The Institute for Genomic Research, Rockville, MD
Methylococcus capsulatus (Bath) is a Gram-negative aerobic bacterium (family Methylococcaceae, gammaproteobacteria) capable of using methane as a sole carbon and energy source. Methane is oxidized via methanol to formaldehyde, which is either assimilated into cellular biomass or dissimilated to carbon dioxide. Methanotrophs such as M. capsulatus are responsible for the oxidation of methane produced through methanogenesis, and are therefore of environmental importance in reducing the amount of greenhouse gases formed in the Earth’s atmosphere. M. capsulatus (Bath) also has considerable potential for large-scale commercial production of microbial proteins by fermentation, due to its ability to grow to high cell density with only natural gas as a carbon source. The 3.3 Mbp M. capsulatus (Bath) genome was sequenced by the random shotgun sequencing strategy, in a collaboration between TIGR and The University of Bergen. At the time of writing, the genome is in gap closure and consists of a single group of contigs assembled from 41,368 individual sequences. A summary of the current status and preliminary annotation will be presented.
148. Comparative Genomic Sequence Analysis of Three Strains of the Plant Pathogen, Xylella fastidiosa
S. Stilwagen1, P. F. Predki1, A. Bhattacharyya3, H. Feil4, W. S. Feil4, F. Larimer2, K. Frankel1, S. Lucas1, D. Rokhsar1, E. Branscomb1, and T. Hawkins1
1U.S. DOE Joint Genome Institute, Walnut Creek, CA 94598
The Joint Genome Institute (JGI) has shotgun sequenced the genomes of two strains of the fastidious, xylem-limited bacterium Xylella fastidiosa to high draft (eightfold coverage). This Gram-negative bacterium causes a range of economically important diseases, which include Pierce’s disease (PD) in grapevines and citrus variegated chlorosis (CVC) in citrus plants. The diseases caused by this plant pathogen are responsible for major economic and crop losses globally. We present here the comparative analysis of the ordered and oriented genome sequences of the strains X. fastidiosa pv. almond and X. fastidiosa pv. oleander versus the finished genome of X. fastidiosa pv. citrus. Our analyses will illustrate the utility of high draft genome sequences, identify the signature features of the Xylella genomes, and reveal the high and broad conservation of the gene repertoire across the three strains. We will further present our findings regarding putative candidate genes that have resulted from horizontal gene transfer.
This work was performed under the auspices of the U.S. Department of Energy, Office of Biological and Environmental Research, by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, the Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098, and the Los Alamos National Laboratory under contract No. W-7405-ENG-36.
149. Finishing/Investigating the Genomes of Prochlorococcus, Synechococcus, and Nitrosomonas: An Overview
P. Chain1, W. Regala1, L. Vergez1, S. Stilwagen2, F. Larimer3, D. Arp4, N. Hommes4, A. Hooper5, S. Chisholm6, G. Rocap7, B. Brahamsha8, B. Palenik8, and J. Lamerdin1
1Lawrence Livermore National Laboratory, Livermore, CA
The output of sequence data from sequencing centers, such as the DOE’s Joint Genome Institute, has been rising at an exponential rate for the past decade or two. The increase in sequencing efficiency over the past few years has resulted in a bottleneck shift, from the accumulation of raw data to the finishing, annotation and analysis of genomes. The first two publications describing complete microbial genomes were reported in 1995. Only seven years later, there are approximately 60 complete, annotated microbial genomes available, along with published draft analyses of several multi-cellular eukaryotes. However, an even greater number of projects are either currently underway or are awaiting the finishing process, which provides a complete picture of the genome including contextual information, captures all the sequences missed in the draft phase, and adds a level of confidence to the genomic sequence.
In support of the DOE’s Carbon Sequestration and Management Program, we undertook the challenging task of finishing the genomes of three autotrophic bacteria which play unique roles in their soil and ocean ecosystems. The genomes of Prochlorococcus marinus MIT9313 and Synechococcus sp. WH8102, two cyanobacteria, had been drafted to 7-fold coverage by the JGI, while Nitrosomonas europaea sp. Schmidt was at near 14-fold coverage. Nitrosomonas europaea is an obligate ammonia-oxidizing beta-proteobacterium that can meet its carbon requirements entirely through the fixation of carbon dioxide, while Prochlorococcus and Synechococcus are the dominant photosynthetic organisms in the open ocean, contributing a significant proportion of the earth’s biomass. Despite the excess sequence coverage of the Nitrosomonas genome, several genomic structural features made circularization a great deal more difficult than for the two cyanobacterial genomes. With these finished genomes, complete annotation and analysis (including comparative analysis) may help elucidate the pathways relevant to understanding the physiological and genetic controls of photosynthesis, nitrogen fixation and carbon cycling.
This work was performed under the auspices of the U. S. Department of Energy by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48.
150. Cloning, Expression, Purification and Initial Characterization of a Three-Heme Cytochrome from Geobacter sulfurreducens
Yuri Y. Londer, P. Raj Pokkuluri, William C. Long, and Marianne Schiffer
Biosciences Division, Argonne National Laboratory, Argonne, IL 60439
Multiheme cytochrome c proteins have been shown to exhibit a metal reductase activity, which is of great environmental interest, especially in bioremediation of contaminated sites. Geobacter sulfurreducens is one of a family of microorganisms that oxidize organic compounds using Fe(III) or other metals as terminal electron acceptors. We cloned a gene encoding a three-heme 9.6 kDa cytochrome from G. sulfurreducens believed to be involved in metal reduction (1) and expressed it in E. coli together with the cytochrome c maturation gene cluster ccmABCDEFGH on a separate plasmid (2). We designed two different systems for expression and correct post-translational processing, under control of the T7 and lac promoters. We found that an N-terminal His-tag is detrimental to proper maturation, in which all three hemes are incorporated into the protein. We also established a method for purifying the mature form and species with fewer heme groups. The pure protein has the same molecular weight and displays the same spectra, in both reduced and oxidized forms, as the protein isolated from G. sulfurreducens. Crystals of the recombinant protein were obtained and initial structure determination is under way. This work is part of an ongoing collaboration with Prof. D. R. Lovley’s group at the University of Massachusetts.
151. Microbial Metal and Metalloid Metabolism and Beyond
Lynda B. M. Ellis, Larry P. Wackett, Wenjun Kang, Bo Hou, and Tony Dodge
University of Minnesota
Microbial functional genomics is faced with an ever-growing list of genes that are labeled “unknown” due to lack of knowledge about their function. The majority of microbial genes encode enzymes. Enzymes are the catalysts of metabolism: catabolism, anabolism, stress responses, and many other cell functions. A major problem facing microbial functional genomics is the wide breadth of microbial metabolism, much of which remains undiscovered. The breadth of microbial metabolism has been surveyed by the PIs and represented according to reaction types on the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD): http://umbbd.ahc.umn.edu/
The database depicts the metabolism of 50 chemical functional groups, representing most current knowledge. At least twice that number might be metabolized by microbes. Thus, at least 50% of the unique biochemical reactions catalyzed by microbes could remain undiscovered. Many genes with unknown function, including conserved hypothetical genes, encode functions yet undiscovered. This gap will be partly filled by the current project. The UM-BBD will be greatly expanded as a resource for microbial functional genomics, adding information on biotransformations of metals, metalloids, metal chelators and toxic organics. Two relevant lists are available: all present UM-BBD pathways (http://umbbd.ahc.umn.edu:8015/umbbd/servlet/pageservlet?ptype=allpathways) and all present UM-BBD metal, metalloid and metal chelator pathways (http://umbbd.ahc.umn.edu/metals.html).
This project was initiated with a meeting of its International Advisory Board in late September, 2001. This productive meeting was the start of several important future collaborations on computational and experimental work.
Computational methods will be developed to predict microbial metabolism that is not yet discovered. A concentrated effort to discover new microbial metabolism will be conducted, focused on metabolism of direct interest to DOE: the transformation of metals, metalloids, organometallics and toxic organics; precisely the type of metabolism that has been characterized most poorly to date. These studies will directly impact functional genomic analysis of DOE-relevant genomes.
152. A Potential Thermobifida fusca Xyloglucan Degrading Operon
Diana Irwin1, Mark Cheng1, Bosong Xiang2, and David B. Wilson1
1Molecular Biology and Genetics and 2Chemistry and Chemical Biology, Cornell University
The annotated genome of Thermobifida fusca contains eight potential cellulase genes. Six of these had been previously cloned and sequenced in our laboratory. We subcloned one of the additional genes, contig 40-gene27, which encodes a glycosyl hydrolase family 74 catalytic domain followed by a family II cellulose binding domain. The expressed and purified protein had low activity on carboxymethyl cellulose and amorphous cellulose, but high activity on xyloglucan. The adjacent upstream gene, contig 40-gene26, encodes a potential alpha-xylosidase, suggesting that this region contains a xyloglucan degrading operon. It is interesting that the gene for Cel9B is close by, and Cel9B is the only other cellulase that has activity on xyloglucan. Time-dependent NMR studies of the products of Cel74A hydrolysis showed that this enzyme uses an inverting mechanism, which would be expected to be used by all family 74 enzymes.
153. Proteome Flux in Photosynthesis and Respiration Mutants of Synechocystis sp. PCC 6803
Julian P. Whitelegge1, Kym F. Faull1, Robby Roberson2, and Wim Vermaas3
1The Pasarow Mass Spectrometry Laboratory, Departments of Psychiatry
and Biobehavioral Sciences, Chemistry and Biochemistry and the Neuropsychiatric
The availability of complete genome data delivers the potential to identify isolated proteins by matching experimental mass data from fragments of the polypeptide chain against hypothetical datasets calculated from translations of genomic sequences. In order to understand the interaction and control networks of proteins involved in photosynthesis and respiration in the cyanobacterium Synechocystis sp. PCC 6803, we are measuring changes in protein expression in populations of cells placed under specific experimental treatments. Early experiments are focusing upon mutants in which either Photosystem 1 or Photosystem 2 is completely knocked out. Two different strategies are being employed in order to fully characterize changes in the proteome. Firstly, 2D-electrophoresis provides a simple way to visualize many of the more abundant proteins of the cell and fluxes of abundance, as well as post-translational modifications that alter mobility in either isoelectric focusing or SDS-PAGE. Secondly, intact protein mass profiles generated by liquid chromatography – mass spectrometry (LC-MS) are used to define the native covalent state of a gene product and the heterogeneity associated with it. Moreover, this latter option provides the ability to monitor subtle covalent modifications that are undetectable in 2D-gel systems, providing a valuable alternative technology for proteomics. Subfractionation techniques will be applied to monitor less abundant members of the proteome and to integrate with parallel studies of ultrastructure and metabolism.
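The matching of experimental fragment masses against masses calculated from translated genomic sequence can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes tryptic digestion (cleavage after K or R, except before P), uses standard monoisotopic residue masses, and the function names, example sequence and tolerance are hypothetical.

```python
import re

# Standard monoisotopic residue masses (Da); a peptide adds one water.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_peptides(protein):
    """In silico digest: cut after K or R unless the next residue is P (assumed rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def match_masses(observed, protein, tol=0.01):
    """Return (peptide, theoretical mass) pairs within tol Da of any observed mass."""
    hits = []
    for pep in tryptic_peptides(protein):
        mass = peptide_mass(pep)
        if any(abs(mass - obs) <= tol for obs in observed):
            hits.append((pep, mass))
    return hits

# Hypothetical translated ORF and one hypothetical observed mass.
orf = "AKRPGLSDK"
print(tryptic_peptides(orf))
print(match_masses([217.1426], orf))
```

A covalent modification would appear as a mass offset between the observed and calculated values, which is the kind of difference the intact-mass strategy described above is meant to reveal.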
154. Modification of the IrrE Protein Sensitizes Deinococcus radiodurans R1 to the Lethal Effects of UV and Ionizing Radiation
Ashlee M. Earl and John R. Battista
Department of Biological Sciences, Louisiana State University and A & M College, Baton Rouge, LA 70803
IRS24 is a strain of Deinococcus radiodurans carrying mutations in two loci, uvrA and irrE, rendering it sensitive to the lethal effects of UV and ionizing radiation. These sensitivities can be reversed by introducing the wild type irrE allele back into IRS24 via natural transformation. The mutation was localized to a 970 bp region containing one putative open reading frame (ORF), DR0167, and 179 bp of upstream sequence. Subsequent sequence analysis of the irrE allele in IRS24 revealed a transition mutation at codon 111 of DR0167 resulting in an arginine to cysteine amino acid substitution. DR0167 was also inactivated by transposon mutagenesis in the wild type strain, R1. The insertion mutant has a more pronounced sensitivity to both UV and ionizing radiation, suggesting that the point mutant retains some activity. BLAST search analysis of DR0167 reveals only minimal similarity to proteins currently available in the databases. A “weak” helix-turn-helix (HTH) motif was identified within the protein that may indicate a capacity to bind DNA and, perhaps, a potential role for IrrE in gene regulation. In order to test whether the mutation in DR0167 causes a regulatory deficiency, we examined the pattern of transcription after applying ionizing radiation, comparing the irrE mutant and its parent using DNA microarray technology.
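As an aside on the arginine to cysteine substitution, the short sketch below enumerates which single transition mutations (purine to purine or pyrimidine to pyrimidine) in the standard genetic code can produce that change. The abstract does not state the actual codon, so this only narrows the possibilities; the helper name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[i] for i, (a, b, c) in enumerate(product(BASES, repeat=3))
}

# A transition swaps purine with purine (A<->G) or pyrimidine with pyrimidine (C<->T).
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}

def single_transition_changes(src_aa, dst_aa):
    """List (original codon, mutant codon) pairs linked by one transition."""
    changes = []
    for codon, aa in CODON_TABLE.items():
        if aa != src_aa:
            continue
        for pos in range(3):
            mutant = codon[:pos] + TRANSITION[codon[pos]] + codon[pos + 1:]
            if CODON_TABLE[mutant] == dst_aa:
                changes.append((codon, mutant))
    return sorted(changes)

print(single_transition_changes("R", "C"))
# [('CGC', 'TGC'), ('CGT', 'TGT')]
```

Both routes are C to T changes at the first codon position on the coding strand (equivalently G to A on the opposite strand).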
155. The Genome of a White Rot Fungus: How to Eat Dead Wood
Nicholas Putnam1,2, Jarrod Chapman1,2, Susan Lucas1, Luis Larrondo3, Maarten Gelpke1,2, Kevin Helfenbein1, Jeff Boore1, Randy Berka4, Doug Hyatt5, Frank Larimer5, Dan Cullen3, Paul Predki1, Trevor Hawkins1, and Dan Rokhsar1,2
1U.S. DOE Joint Genome Institute, Walnut Creek, CA 94598
White rot fungi produce a suite of unique extracellular oxidative enzymes that degrade lignin, a complex aromatic polymer that is a major component of wood, as well as related compounds found in explosive-contaminated materials, pesticides, and toxic wastes. To elucidate the genomic toolkit of these fungi, we have sequenced the thirty million base-pair genome of Phanerochaete chrysosporium to high draft using a whole genome shotgun method, making it the first basidiomycete to be sequenced. Assembly of the sequence fragments was carried out using a newly developed algorithm that self-consistently incorporates paired-end information and provides a suite of analysis tools for large-scale assemblies. We present an analysis of the P. chrysosporium genome, including the major families of secreted enzymes that characterize the white rot fungi, an analysis of its mitochondrial genome, and phylogenetic comparisons with more distantly related fungi, animals, and plants.
This work was performed under the auspices of the U.S. Department of Energy, Office of Biological and Environmental Research, by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, the Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098, and the Los Alamos National Laboratory under contract No. W-7405-ENG-36.
156. Metabolic Pathway Elucidation for Microbial Genomes
Imran Shah, Ronald Taylor, and Shilpa Rao
University of Colorado School of Medicine
The goal of this work is to develop predictive computational tools for elucidating microbial metabolic pathways. Metabolic inference is becoming increasingly feasible with the availability of large amounts of biochemical data. Our system consists of three main modules: (i) a biochemical knowledgebase that integrates data from molecules to pathways and supports deductive inference, (ii) a predictive tool that aids in automated assignment of catalytic functions to putative proteins, and (iii) a pathway synthesis algorithm that generates pathways from catalytic function assignments. We are using this system to analyze metabolic pathways using whole microbial genomic data.
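To make module (iii) concrete, one simple way to generate a pathway from catalytic function assignments is a breadth-first search over the reactions those functions catalyze. The sketch below is an assumption-laden illustration, not the algorithm used in this work: the data layout and the function name synthesize_pathway are hypothetical, and the three reactions are the familiar opening steps of glycolysis with their EC numbers.

```python
from collections import deque

# Hypothetical output of a function-assignment step:
# (EC number assigned to some ORF, substrate, product).
ASSIGNED_REACTIONS = [
    ("EC 2.7.1.1", "glucose", "glucose-6-phosphate"),
    ("EC 5.3.1.9", "glucose-6-phosphate", "fructose-6-phosphate"),
    ("EC 2.7.1.11", "fructose-6-phosphate", "fructose-1,6-bisphosphate"),
]

def synthesize_pathway(reactions, source, target):
    """Breadth-first search for the shortest chain of assigned reactions
    converting source into target; returns the EC numbers in order,
    or None when the assignments do not connect the two metabolites."""
    graph = {}
    for ec, substrate, prod in reactions:
        graph.setdefault(substrate, []).append((ec, prod))
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        metabolite, path = queue.popleft()
        if metabolite == target:
            return path
        for ec, prod in graph.get(metabolite, []):
            if prod not in seen:
                seen.add(prod)
                queue.append((prod, path + [ec]))
    return None

print(synthesize_pathway(ASSIGNED_REACTIONS, "glucose", "fructose-1,6-bisphosphate"))
# A genome lacking the isomerase assignment yields no route:
print(synthesize_pathway(ASSIGNED_REACTIONS[:1] + ASSIGNED_REACTIONS[2:],
                         "glucose", "fructose-1,6-bisphosphate"))
```

A missing route of this kind is a useful signal in its own right, since it points either to a genuinely absent pathway or to an unassigned gene.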
157. Annotation of Shewanella oneidensis MR-1 from a Metabolic and Protein-Family View
Monica Riley and Margrethe H. Serres
Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543
Since Shewanella oneidensis MR-1 (formerly Shewanella putrefaciens MR-1) was first isolated in 1988, experimental biochemical studies have been aimed at understanding its interesting energy metabolism and its ability to use metal ions as both electron donors and acceptors. The genome of Shewanella oneidensis MR-1 has recently been sequenced by TIGR, opening the door for full genomic analysis. We are starting the work of determining the complete metabolic and energy transfer capabilities of Shewanella oneidensis MR-1. Proteins with sequence similarity to those of E. coli K-12 and over 40 other genomes are identified using DARWIN analysis. EcoCyc and MetaCyc are initial sources for metabolic pathways. Based on amino acid alignments of at least 83 amino acids, 57% of the Shewanella proteins have sequence-similar matches to E. coli at a similarity distance of <200 PAM units. Putative functions can be assigned to 51% of the proteins based on matches to E. coli alone. Initial analysis of the metabolic pathways shows that Shewanella contains sequence-similar proteins to a majority of E. coli proteins involved in energy metabolism, building block biosynthesis, and intermediate metabolism, but some functions are more closely related to those of other organisms. Details will be presented. Proteins of Shewanella oneidensis MR-1 are grouped into sequence-related groups representing paralogous proteins within the Shewanella genome, presumed to have arisen through duplication and divergence either in the Shewanella genome or in its ancestors. These groups provide a source for annotation of gene function as well as for studying the evolution of functions in the organism. Structural predictions for the encoded proteins are in progress and will be used in the annotation procedure as well as in protein family studies.
158. Modeling DNA repair in Deinococcus radiodurans
Shwetal S. Patel and Jeremy S. Edwards
Chemical Engineering, University of Delaware
We are developing novel computational tools to analyze the DNA repair capabilities of Deinococcus radiodurans and their relationship to the metabolic capabilities of this organism. Such tools will be extremely useful in providing the insights required to metabolically engineer D. radiodurans strains capable of growing under nutrient-poor conditions and yet possessing extraordinary DNA repair capabilities. We are moving towards this ultimate goal along two directions. The first involves the construction of a database for the automated construction of metabolic flux balance models. We will then apply flux balance analysis to study the metabolic capabilities of D. radiodurans and identify the optimal growth characteristics under different conditions. Additionally, we will study the metabolic pathway structure of D. radiodurans to comprehensively examine its metabolic repertoire and elucidate its regulatory structure. The second is concerned with the development of dynamic models of the known DNA repair pathways in D. radiodurans. The structure of these dynamic models will evolve through critical tests of the hypothesis that the observed DNA repair capabilities of D. radiodurans are due solely to known mechanisms. These dynamic models will be used to compute the metabolic flux requirements during DNA repair. This will provide the critical link between the metabolic and DNA repair capabilities of D. radiodurans.
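For readers unfamiliar with flux balance analysis, it is conventionally posed as a linear program over steady-state reaction fluxes. The formulation below is the standard textbook form, shown for orientation only and not necessarily the exact model used in this work; the symbols S (stoichiometric matrix), v (flux vector), c (objective weights, typically selecting a biomass or growth reaction) and the flux bounds are notation introduced here.

```latex
\begin{aligned}
\max_{v} \quad & c^{\mathsf{T}} v \\
\text{subject to} \quad & S\,v = 0, \\
& v^{\min}_j \le v_j \le v^{\max}_j \quad \text{for every reaction } j .
\end{aligned}
```

The constraint S v = 0 expresses that each internal metabolite is produced and consumed at equal rates, and different growth conditions are represented by changing the bounds on uptake fluxes.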
In this presentation, we will discuss the current state of our work. In particular, we will discuss a mathematical model to describe the pathway for nucleotide excision repair. Taken together, our analysis will provide valuable information for the metabolic engineering of D. radiodurans strains for bioremediation, and our work will significantly contribute to the growing fields of bioinformatics, computational biology, functional genomics and DNA repair.
159. A Novel Combinatorial Biology Method to Functionally Characterize Microbial ORFs
Diane J. Rodi and Lee Makowski
Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439
This project applies a novel approach to genome-wide identification of small molecule binding proteins. Preliminary results demonstrated that the similarity between the sequence of a protein and the sequences of affinity-selected, phage-displayed peptides is predictive of protein binding to a small molecule ligand. Affinity-selected peptides provide information analogous to that of a consensus-binding sequence, and can be used to identify ligand-binding sites. Libraries of phage-displayed peptides are being screened for affinity to common metabolites and other small molecules with the goal of applying the affinity-selected sequences to genome-wide identification of proteins that have a high probability of binding to the screened ligands. Our initial experiments have involved affinity selection of ATP-binding peptides. Details of the selection process have been analyzed through the use of four different sets of experimental conditions in order to optimize selection. A comprehensive analysis of the sequences of peptides that contact ATP in ATP-binding proteins whose three-dimensional structures are known has been carried out to provide a basis for analysis of the ATP-selected peptides. Detailed informatic analysis has been used to identify significant and informative differences between the sequences of ATP-binding peptides and the sequences of peptides that contact ATP in ATP-binding proteins. A comprehensive analysis of these sequences is providing insights into the process of molecular recognition and the way ATP interacts with proteins.
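The idea of using affinity-selected peptides like a consensus sequence to locate a candidate binding site can be sketched with a sliding-window comparison. The scoring scheme here (plain residue identity, summed over all peptides) is an assumption chosen for brevity and is not the similarity measure used in this project; the sequences and the function names are hypothetical, and all peptides are assumed to have the same length.

```python
def window_identity(peptide, window):
    """Count positions at which the peptide and the protein window agree."""
    return sum(a == b for a, b in zip(peptide, window))

def best_site(protein, peptides):
    """Slide a peptide-length window along the protein and return
    (start index, summed identity score) of the best-scoring window.
    The first window wins in case of ties."""
    k = len(peptides[0])
    best_start, best_score = None, -1
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        score = sum(window_identity(p, window) for p in peptides)
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score

# Hypothetical affinity-selected peptides and a hypothetical protein fragment.
selected = ["GSGKST", "GAGKSS", "GPGKTT"]
protein = "MSTLAGPSGSGKSTLLRAV"

start, score = best_site(protein, selected)
print(start, protein[start:start + 6], score)
```

A practical implementation would more plausibly weight substitutions with a scoring matrix and assess significance against shuffled sequences; both are omitted here.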
160. Annotation of Draft Microbial Genomes
Frank W. Larimer, Loren Hauser, Miriam Land, Doug Hyatt, Manesh Shah, Philip LoCascio, Edward C. Uberbacher, and Inna Vokler
Oak Ridge National Laboratory
A draft analysis pipeline has been constructed to provide annotation for the microbial sequencing projects being carried out at the Joint Genome Institute. The pipeline was applied to annotating the 15 genomes sequenced during the October 2000 Microbe Month effort; an additional ~30 genomes are anticipated to be processed as they become available in early 2002. Multiple gene callers (Generation, Glimmer and Critica) are used to construct a candidate gene model set. The conceptual translations of these gene models are used to generate similarity search results and protein family relationships; from these results a metabolic framework is constructed and functional roles are assigned. Simple repeats, complex repeats, tRNA genes and other structural RNA genes are also identified. Annotation summaries are made available through the JGI Microbial Sequencing web site; in addition, draft results are being integrated into the interactive display schemes of the Genome Channel/Catalog. Extensive use of high-performance computational tools has enabled rapid processing of genomes in batch. As of this writing, 22 genomes, comprising over 93 million bp of sequence, in ~4000 contigs have been processed to generate ~85,000 candidate peptide translations.
161. Annotation of Microbial Genomes Relevant to DOE’s Carbon Management and Sequestration Program
F. Larimer1, L. Hauser1, M. Land1, D. Hyatt1, M. Shah1, S. Stilwagen2, P. Predki2, D. Arp3, A. Hooper4, S. Chisholm5, G. Rocap6, B. Palenik7, J. Waterbury8, R. Atlas9, J. Meeks10, C. Harwood11, R. Tabita12, P. Chain13, and J. Lamerdin13
1Oak Ridge National Laboratory, Oak Ridge, TN
A diverse group of autotrophic microorganisms has been sequenced to further fundamental research into carbon management topics that would enable a reduction or slowed growth of the atmospheric concentration of carbon dioxide. Potential routes include augmenting the natural carbon cycle by identifying ways to enhance carbon sequestration in the terrestrial biosphere through CO2 removal from the atmosphere and storage in biomass and soils, and evaluating the potential for increased carbon sequestration in the open oceans. The aim of this research is to improve upon our rather rudimentary understanding of how carbon is used and stored in the biosphere. By systematic analysis of each genome, we hope to identify specialized nutrient uptake systems and pathways that contribute to or regulate nitrogen utilization, carbon cycling and photosynthesis.
The target genomes comprise a diverse group of autotrophs that are significant in their respective ecosystems and contribute materially to cycling of atmospheric gases. Six genomes are being examined: three are marine cyanobacteria, Prochlorococcus marinus ecotypes MED4 and MIT 9313, and Synechococcus sp. WH8102; Nostoc punctiforme, a nitrogen-fixing fresh-water cyanobacterium; Rhodopseudomonas palustris, a metabolically versatile anoxygenic photobacterium; and Nitrosomonas europaea, an ammonia-oxidizing beta-proteobacterium. The three marine cyanobacteria and R. palustris are currently undergoing final annotation; N. europaea is at closure and N. punctiforme is in finishing.
These genomes comprise an extensive resource for comparative genomics: the cyanobacterial genomes, together with completed and ongoing cyanobacterial sequencing elsewhere, represent the first opportunity to deeply examine this form of photoautotrophy; R. palustris is extensively informed by the recently completed Caulobacter crescentus, Sinorhizobium meliloti and Mesorhizobium loti genomes, as well as the larger contiguous portions of the draft Rhodobacter sphaeroides genome, expanding the alpha-proteobacterial group.
(Research supported by the Office of Biological and Environmental Research, USDOE under contract number DE-AC05-00OR22725 with Oak Ridge National Laboratory, managed by UT-Battelle, LLC)
162. A Genome-Wide Search for Archaeal Promoter Elements Enhu Li1, Aaron A. Best1, Gretchen M. Colon2, Claudia I. Reich1, and Gary J. Olsen1
1University of Illinois at Urbana-Champaign
The archaeal basal transcription system is a simpler version of the eukaryotic system, having a single RNA polymerase (RNAP) and only two general basal transcription factors: TATA-binding protein (TBP) and transcription factor B (TFB). These factors bind specific promoter elements and recruit RNAP. Though consensus promoter elements and basal transcription factors have been identified in Archaea, it remains unclear how transcription is (i) initiated in the absence of canonical promoter elements or (ii) regulated.
We have adopted an iterative, genome-wide strategy to identify promoter elements in Archaea, using Methanococcus jannaschii as a model system. The strategy has isolated and identified 15 of the 23 predicted promoter regions for tRNA transcripts. Alignment of isolated tRNA promoters reveals near-consensus TATA-elements and TFB recognition elements (BREs) located within 100 nucleotides (nt) of the tRNA coding sequences. A third conserved element, possibly serving as an archaeal Initiator, is located ca. 21 nt downstream of the TATA-element in most of the isolated tRNA promoters. Binding of TBP and TFB to the predicted promoter elements was confirmed by DNase I footprinting. The eight remaining tRNA promoters have been characterized by a targeted approach, and analyses reveal that three of these differ significantly from the consensus sequences. Electrophoretic mobility shift assays reveal that promoter elements deviating from the consensus are bound by TBP/TFB with lower affinities than promoters exhibiting the canonical pattern. In addition, a correlation between tRNA promoter strength and predicted codon usage was observed: promoters exhibiting a high degree of similarity to consensus sequences drive expression of tRNAs with correspondingly high codon usage. Experiments are currently underway to validate these observed trends. Generally, the search strategy selected strong tRNA promoters and some strong protein promoters. However, promoters with lower affinity for TBP and TFB have also been isolated, suggesting that this strategy will be useful in the identification of novel and/or sub-optimal promoter elements.
In addition to identification of archaeal promoters, we have addressed questions surrounding archaeal RNAP (i) structure and (ii) recruitment to promoters using in vivo and in vitro protein-protein interaction methods. (i) Archaeal RNAP subunit composition is similar to that seen in eukaryotic RNAPs. We have demonstrated that archaeal and eukaryal RNAPs adopt similar subunit architectures, extending evidence of homology from the sequence level to quaternary structure interactions. (ii) Recruitment of archaeal RNAP to canonical promoters occurs through interactions between TFB and specific RNAP subunits. We have identified subunits of RNAP that contact TFB and propose a model for the DNA/TBP/TFB/RNAP transcription initiation complex.
163. New Markov Model Approaches to Deciphering Microbial Genome Function John M. Logsdon, Jr.1, Mark A. Ragan2, and Mark Borodovsky3
1Department of Biology, Emory University, Atlanta, GA 30322
Upon development of efficient algorithms for gene finding in prokaryotic genomes it was observed that there are hundreds of genes that escape confident prediction unless special efforts are taken. The most interesting genes—often also the most difficult to predict—are those atypical genes whose DNA sequence features deviate strongly from the ‘typical’ ones. We have begun efforts to improve accuracy of predicting atypical genes in prokaryotes by Markov and Hidden Markov model based algorithms, such as GeneMark-Genesis and GeneMark.hmm. An important goal of this project will be the comparison of the predicted sets of atypical genes with sets indicated by alternative approaches (i.e. base composition bias methods). The algorithms will be extended to the analysis of genome draft sequences (nearly complete genomes) produced by high-throughput sequencing. Using atypical genes predicted by careful implementation of these new methods and estimated at hundreds per genome, a number of relevant biological questions will be addressed. Most importantly, we will use rigorous phylogenetic reconstruction methods to test the possibility that each atypical gene is a result of lateral (or horizontal) gene transfer (LGT), and, if so, from what lineage it was derived. We will, thus, identify the fraction of atypical genes that are bona fide LGTs, both across all genomes and within given genomes. With these analyses, we will be able to estimate the overall rates of LGT, particularly with respect to phylogenetic and/or ecological separation between donors and recipients. We will also determine, using comparative database methods, the putative functional roles of these atypical genes in order to better understand what types are most prone to be identified as atypical and, of those, which gene types are most likely to have been transferred between species. In particular, we will focus on those genes that have clear roles in the adaptation to or alterations of natural environments. 
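The general idea of flagging atypical genes with a Markov model can be illustrated with a minimal sketch. This is not the GeneMark-Genesis or GeneMark.hmm implementation; it assumes a simple first-order Markov chain trained on genes taken to be 'typical', and scores a candidate by its average per-base log-likelihood. The function names, the smoothing constant and the toy sequences are all hypothetical.

```python
import math
from collections import defaultdict

BASES = "ACGT"


def train_markov(seqs, order=1, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) with additive smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    model = {}
    for ctx, nxt in counts.items():
        total = sum(nxt.values()) + pseudocount * len(BASES)
        model[ctx] = {b: (nxt.get(b, 0.0) + pseudocount) / total for b in BASES}
    return model


def avg_log_likelihood(seq, model, order=1):
    """Average log-probability per base of `seq` under the trained model.

    Unseen contexts or symbols fall back to a uniform probability (an
    assumption made only to keep this sketch short).
    """
    uniform = 1.0 / len(BASES)
    total, n = 0.0, 0
    for i in range(order, len(seq)):
        probs = model.get(seq[i - order:i])
        p = probs.get(seq[i], uniform) if probs else uniform
        total += math.log(p)
        n += 1
    return total / n if n else 0.0


# Toy data: the "typical" training set and both candidates are invented.
typical_model = train_markov(["ACGTACGTACGT" * 5])
typical_score = avg_log_likelihood("ACGTACGTACGT", typical_model)
atypical_score = avg_log_likelihood("GGGGCCCCGGGG", typical_model)
```

A gene whose score falls well below the scores of the training genes would be a candidate 'atypical' gene under this simplified scheme; a real gene finder uses higher-order, codon-position-aware models and a principled decision threshold.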
Of all the genomes, the remaining, ‘typical’, set of genes (i.e. those which show little, if any, evidence of LGT) will be used to assess the validity of a phylogenetically stable ‘core’ of microbial genes. From these analyses, we plan to build a database of atypical genes and their inferred phylogenetic relationships for publicly available microbial genomes along with a web-based interface. This will allow its contents to be displayed and searched for specific genes, proteins and their phylogenetic relationships. These analyses and the resulting database will be a valuable resource for studies of microbial genome structural and functional evolution.
164. Genomic Plasticity in Ralstonia eutropha and Ralstonia pickettii: Evidence for Rapid Genomic Change and Adaptation T. L. Marsh, S-H Kim, N. M. Isaacs, S. Eichorst, and K. Konstantinidis
Michigan State University, Center for Microbial Ecology and Department of Microbiology
We have begun an analysis of genomic plasticity in the genus Ralstonia using recently isolated strains of R. eutropha and R. pickettii. The former served as the ancestral strain in a long-term evolution experiment in which eighteen independent lineages were propagated for 1000 generations under two different environmental conditions. Dramatic changes in both phenotype and genotype have been observed in the evolved lineages, including large deletions from the genome. These deletions are being analyzed with an eye to identifying apparent preferred pathways in genomic degeneration. Regarding R. pickettii, 20 strains have been isolated from a 20 cm (depth) core of lake sediment contaminated with high concentrations of copper. All of these isolates are resistant to high levels of copper as well as several other heavy metals. The isolates display substantial differences in REP-PCR profiles, pulsed-field gel patterns, and plasmid content, suggesting significant genomic plasticity within a relatively small habitat volume. We report here on the sequence of a small plasmid detected in several R. pickettii isolates.
165. Lateral Gene Transfer and the History of Bacterial Genomes Howard Ochman
Department of Ecology and Evolutionary Biology University of Arizona Tucson, Arizona 85721
Deriving meaningful information from complete genomes depends upon comparisons between sequences. Therefore, an evolutionary framework is required for all stages of genome analysis and interpretation. We are using universally distributed molecular characters to resolve the relationships among bacterial lineages in an attempt to determine the evolutionary history and degree of gene transfer among bacteria. The objectives of the proposed research are to use existing published and newly determined nucleotide sequences of a large set of universally distributed genes among bacteria of differing degrees of genetic relatedness and to address several questions relating to the role of gene transfer in shaping bacterial genomes. In addition to supplying information about the extent of gene transfer, this research serves three additional functions: (1) The set of conserved genes adopted for these studies will provide a new framework for the identification and classification of bacteria spanning all levels of genetic divergence. (2) Sequence information from a common set of genes will allow, for the first time, direct comparisons of the rates and patterns of nucleotide evolution within and among bacteria. (3) Analysis of a defined set of genes yields a rapid measure of genome dynamics, makes use of the rapidly increasing number of incomplete, unassembled or unannotated bacterial genomes, and can be used to direct the focus of new sequencing endeavors.
166. Physiomics Array: A Platform for Genome Research and Cultivation of Difficult-to-Cultivate Microorganisms Michel Marharbiz, William Holtz, Roger Howe, and Jay D. Keasling
Departments of Chemical Engineering and Electrical Engineering and Computer Science University of California Berkeley, CA 94720
The sequences of a number of microbial genomes have recently been completed or will be completed shortly. Many of these organisms contain novel genes, the function for which is not known. Further, many of these organisms have novel characteristics —such as the ability to transform abundant biopolymers into biofuels or the ability to remediate environmental contaminants—that make them important for DOE purposes.
The cultivation parameter space that the researcher needs to explore to determine the function of novel genes, the optimal culture conditions for a desired bioconversion, or the most appropriate cultivation conditions for a previously unculturable microorganism is extensive. Given the large numbers of sequenced organisms, of unknown genes in each of those organisms, and of previously unculturable organisms, a high-throughput cultivation device would allow one to explore cultivation parameter space quickly.
We are developing an integrated research program to develop a high-throughput physiomics array to assess the effects of changes in culture parameters or environmental contaminants on cell physiology and gene expression or to cultivate difficult-to-cultivate or previously unculturable microorganisms. The specific aims are as follows:
167. Optical Mapping: New Technologies and Applications David C. Schwartz, Shiguo Zhou, Ana Garic-Stankovic, Alex Lim, Eileen Dimalanta, Arvind Ramanathan, Tian Wu, Ossmat Azzam, Casey Lamers, Brian Lepore, Aaron Anderson, Michael Bechner, Erika Kvikstad, Natalie Kaech, Andrew Kile, Jessica Severin, Rodney Runnheim, Danile Forrest, Christopher Churas, Galex Yen, Jonathan Day, Bud Mishra, and Thomas Anantharaman
University of Wisconsin-Madison, Department of Chemistry, Department of Genetics, UW Biotechnology Center
Our laboratory has developed Optical Mapping, a system for the construction of ordered restriction maps from individual DNA molecules. Our work centers on the development of new systems for genome analysis, including Optical Mapping, which exploit novel macromolecular phenomena to address important biological problems. These are built upon a complex mix of principles derived from multiple disciplines including chemistry, genetics, computer science, biochemistry, optics, surface science and micro/nanofabrication. Recently, “Shotgun” Optical Mapping was used to construct whole genome restriction maps of Escherichia coli O157:H7, Deinococcus radiodurans, and Plasmodium falciparum (the major causative agent of malarial disease) without the use of PCR, electrophoresis, or clones. Presently we are applying Shotgun Optical Mapping to the analysis of more complex genomes, including human and rice, as well as of numerous microorganisms, where our mapping efforts are offering new routes to understanding genome plasticity across closely related species. These efforts are also helping to facilitate the ongoing microbial sequencing projects at JGI, in terms of providing means for validation and aids for assembly. With the advent of a high-throughput Optical Mapping System, we are developing novel approaches for human association studies using a new class of genome markers that are designed to encompass SNPs (Single Nucleotide Polymorphisms), yet reveal genome variation on a scale not previously discerned for large populations. Current thinking in the field is centered on the use of a limited number of SNPs to leverage the apparent state of linkage disequilibrium, which is indicative of a young species; however, current approaches based on chips or mass spectrometry are dependent on huge numbers of oligonucleotides. This requirement limits analysis to a series of discrete loci and renders such approaches inadequate for the assessment of a broad spectrum of genome variation motifs.
This limitation of current systems used for large-scale association studies may neglect discovery of important factors contributing to complex traits. In this regard, haplotyping is emerging as the means to perform detailed analysis of mutations and is expected to play a major role in the emerging field of pharmacogenomics. The Optical Mapping platform is uniquely suited for haplotyping since analysis of single molecules allows for the unambiguous phasing of genetic markers within populations of unrelated individuals.
168. Spectroscopic Studies of Desulfovibrio desulfuricans Cytochrome c3 William H. Woodruff1, Judy D. Wall2, Robert J. Donohoe1, and Geoffrey B. West1
1Los Alamos National Laboratory
Desulfovibrio desulfuricans is a sulfate-reducing bacterium that also is able to reduce a variety of metals including Cr(VI) and U(VI). Reduction in vitro with hydrogen as electron donor is dependent on the four-heme periplasmic cytochrome c3, a broad-specificity redox protein. It is unknown whether cytochrome c3 acts as the proximate electron donor to the metal species, or whether it is simply an electron carrier in the respiratory redox network of this organism. We have undertaken characterization of cytochrome c3 by spectroscopic and other physical methods to establish the role of this protein in metal reduction. Resonance Raman results allow specific hemes and their redox states to be distinguished, and infrared results reveal the sidechain proton-transfer reactions that accompany the electron transfer steps. An allometric scaling model shows general correlations between genome length, average copy numbers of gene products, and bioenergetic capacity over a very large range of bacterial size.
169. Identification and Isolation of Active, Non-Cultured Bacteria for Genome Analysis Cheryl R. Kuske, Susan M. Barns, Ellie Redfield, and Leslie E. Sommerville
Bioscience Division, M888, Los Alamos National Laboratory
At least one third of the bacterial divisions identified to date have no cultured members. Non-cultured bacteria representing several bacterial divisions are widespread and potentially abundant in soils and other environments. For example, we have found that members of the Acidobacterium division are among the most abundant bacteria in some soils, yet we know almost nothing of their functions. The overall goals of our project are to determine the abundant and active members of the Acidobacterium division in pristine and contaminated soil and aquifer material using RT-PCR, 16S rRNA-targeted probes and in situ microscopy, and to collect cells of active, non-cultured groups by flow cytometry cell sorting. The pooled DNA of non-cultured bacteria isolated directly from the environment will be a valuable resource of genetic material for comparative analyses of conserved and novel gene families, and for targeted genome sequencing. Work in the last year has focused on technical advances in hybridization and flow cytometry separation of bacterial cells from natural environments. To apply these techniques to analysis of cells from soil and to enrich for bacterial groups of interest, we are comparing the bacterial diversity found in pools of bacterial cells fractionated from soil with that of the parent environment. We have also begun work on RT-PCR methods for analysis of active Acidobacterium division members from contaminated and pristine soils.
170. Assembly of Microbial Sub-Genomes from Beneath a Leaking High-Level Radioactive Waste Tank Fred Brockman, Margie Romine, Greg Newton, Amber Alford, Shu-mei Li, Jim Fredrickson, Kristen Kadner, Paul Richardson, and Paul Predki
Pacific Northwest National Laboratory and DOE Joint Genome Institute
Our goal is to demonstrate the ability to obtain 1 to 2 Mbp of genetically linked sequence (a sub-genome) from microorganisms that cannot be grown in pure culture, by direct cloning of DNA from environmental enrichments and high-throughput sequencing of BAC ends. Simulations indicate that paired-end sequencing of approximately 5000 BACs from a well-represented library where 5 to 10% of the bacterial community is composed of an “archetypal species” (a single species or a closely related group of species) could produce a contig of 1-2 Mbp before chromosome walking fails. Subsurface vadose zone (aerobic) sediment samples, representing the most radioactive sediment samples ever taken at the DOE Hanford Site in Washington state, are the focus for this demonstration. Samples contained up to 50 microCuries of Cesium-137 per gram sediment, other radionuclides at nano- and picoCurie levels, and pH values up to 9.8. Samples in which no microorganisms could be grown on solid media but which produced growth in pH 10 and/or 50 degree C liquid media enrichments were selected for study. In the first several months of the project, the microbial communities in these enrichments and subsequent transfers have been screened for species that comprise >5% of the community and for bacterial divisions with few or zero cultured representatives, as a basis for determining appropriate community(ies) for BAC library construction.
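The kind of simulation mentioned above can be sketched in a heavily simplified form. The abstract does not describe the actual simulation, so everything below is an assumption made for illustration: a linear 3 Mbp target genome, a fixed 100 kb insert size, uniformly random clone placement, and a walk that continues as long as a neighbouring clone overlaps the current contig. The function name and all default values are hypothetical, and the sketch is not expected to reproduce the 1-2 Mbp figure.

```python
import random


def simulate_walk(n_bacs=5000, fraction=0.075, genome_bp=3_000_000,
                  insert_bp=100_000, seed=0):
    """Return the length (bp) of the contig reachable from one random seed clone.

    Assumptions (not from the abstract): linear genome, equal insert sizes,
    uniform random clone starts, and walking fails at the first gap.
    """
    rng = random.Random(seed)
    # Each library clone comes from the archetypal species with probability `fraction`.
    n_target = sum(rng.random() < fraction for _ in range(n_bacs))
    if n_target == 0:
        return 0
    starts = sorted(rng.randrange(0, genome_bp - insert_bp + 1)
                    for _ in range(n_target))
    idx = rng.randrange(n_target)  # seed clone for the walk
    lo = hi = idx
    # Walk right: the next clone must start inside the current rightmost clone.
    while hi + 1 < n_target and starts[hi + 1] <= starts[hi] + insert_bp:
        hi += 1
    # Walk left: the previous clone must reach the current leftmost clone.
    while lo > 0 and starts[lo - 1] + insert_bp >= starts[lo]:
        lo -= 1
    return starts[hi] + insert_bp - starts[lo]


contig_bp = simulate_walk()
```

Repeating the call over many seeds and over a range of `fraction` values would give a distribution of contig lengths; a realistic model would also need to account for chimeric clones, repeats, uneven representation and closely related co-occurring species.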
171. The Marine Environment from a Cyanobacterial Perspective Brian Palenik1, Ian Paulsen2, Bianca Brahamsha1, Rebecca Langlois1, and John Waterbury3
1Scripps Institution of Oceanography, University of California,
The genome sequence of the marine cyanobacterium Synechococcus strain WH8102 is nearly completed. This microorganism was chosen because cyanobacteria similar to WH8102 are ubiquitous and significant primary producers in oligotrophic marine environments. In addition, this strain possesses a unique type of prokaryotic motility and is amenable to genetic manipulation. The genome is estimated to be 2.7 Mb with approximately 2390 ORFs. The transporter complement of Synechococcus WH8102 was analyzed by screening its genome against a database of known and putative transporters by BLAST and HMM-based analyses. Approximately eighty transport systems were identified, comprising 130+ genes. Comparison with the transporter complement of other complete genomes indicated that WH8102 has an emphasis on transport of inorganic anions, in particular with multiple transporters for nitrate, sulfate and chloride. In terms of organic nutrients, it is predicted to transport a variety of amino acids and a limited number of sugars. The transporters and other activities of the cell are coordinated by a surprisingly small number of two-component regulatory systems compared to the freshwater cyanobacterium Synechocystis PCC6803. Ultimately the WH8102 genome will provide us with a better understanding of how cyanobacteria perceive and respond to the marine environment.
172. Metagenomic Analysis of Uncultured Cytophaga and Beta-1,4 Glycanases in Marine Consortia David L. Kirchman and Matthew T. Cottrell
College of Marine Studies, University of Delaware
Culture-independent studies have shown that microbial consortia in natural environments are incredibly diverse and are dominated by bacteria and archaea substantially different from microbes maintained in pure laboratory cultures. Recent studies indicate that previous culture-independent studies using PCR-based methods have largely overlooked an important group of uncultured bacteria, the Cytophagales. These bacteria appear to be abundant in the oceans and probably other oxic environments. We hypothesize that the key to understanding consortia and their function in organic matter mineralization in oxic environments is to focus on uncultured Cytophagales and their genes encoding endoglycanases. This poster will summarize our progress in understanding uncultured Cytophagales and our plans for our new DOE-supported metagenomic project. We have been using an approach that combines microautoradiography with fluorescence in situ hybridization (Micro-FISH) to examine which bacterial groups are responsible for using naturally-occurring organic material. As we had hypothesized based on the work with cultured representatives, uncultured Cytophagales appear to dominate use of protein and chitin in the Delaware Bay and coastal waters. Perhaps as a result of protein and chitin inputs, uncultured Cytophagales are abundant throughout the Delaware estuary. For our new project, we intend to construct two BAC libraries with DNA taken directly (no PCR) from uncultured microbial consortia found in a coastal marine environment. Microbes on macroscopic aggregates will be one target for our clone libraries. These organic aggregates, which are important in carbon transport and storage in the oceans, harbor dense assemblages of Cytophagales. The libraries will be screened for 16S rRNA genes, for cellulase and chitinase-active clones, and for clones bearing genes of these enzymes. The DNA probes for screening the libraries will be constructed from the sequence data now emerging from the C. hutchinsonii project, which has already found about 15 presumed endoglucanases (mainly cellulases). The proposed work should reveal much about a neglected microbial group that appears to dominate microbial assemblages in oxic environments. Ultimately, the data will be used to improve models of carbon cycles and storage in the oceans and other environments where Cytophagales are abundant and ecologically important.
173. Rational Design and Application of DNA Signatures P. Scott White, John Nolan, Rich Okinaka, Paul Jackson, and Paul Keim
Bioscience Division, Los Alamos National Laboratory and Department of Microbiology, Northern Arizona University
With the rapid accumulation of direct sequence data for a variety of pathogenic organisms, the development and application of pathogen “signatures” is undergoing a paradigm shift from empirical development of signatures using detection platform-specific methods, to rationally designed signatures that can be assessed in a platform-independent manner. Thus, DNA sequence is the “signature”, and the signatures have precise phylogenetic and functional significance. We are using phylogenetic and functional analysis, combined with a rapid method for direct sequence analysis using microsphere arrays and flow cytometry, to exploit the information contained in DNA sequence from multiple genetic loci. Such Multi-Locus Sequence Typing (or MLST) has the potential to revolutionize DNA-based analysis in applications ranging from biological point detection to water and food safety. Currently we are focusing on DNA sequence analysis tools (bioinformatics), the design of DNA primers and probes (reagent development), and protocol development for both laboratory and field applications.
174. Pathogen Detection: Successes and Limitations of TaqMan® PCR
Shea N. Gardner, Thomas A. Kuczmarski, Elizabeth A. Vitalis, and Tom Slezak
Lawrence Livermore National Laboratory
Recent events illustrate the imperative to rapidly and accurately detect and identify pathogens during disease outbreaks, whether they are natural or engineered. Detection techniques must be both species-wide (capable of detecting all known strains of a given species) and species-specific. Fluorogenic probe-based PCR assays (TaqMan®; Perkin Elmer Corp./Applied Biosystems, Foster City, Calif.) may be a sensitive, fast method to identify species in which the genome is conserved among strains, for example, in West Nile/Kunjin virus. For species such as Venezuelan Equine Encephalitis and HIV, however, the strains are highly divergent. We use computational methods to show that 6-10 TaqMan® primer/probe sequences, or signatures, are needed to ensure that all strains will be detected, an unfeasible number considering the cost of TaqMan® probes. We compare TaqMan® with the alternative nucleic acid-based detection techniques of microarray, chip and bead technologies in terms of sensitivity, speed, and cost.
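One way to frame the question of how many signatures are needed to detect every strain is as a set-cover problem. The abstract does not state which computational method was used, so the greedy heuristic below is only an illustrative sketch; the function name, the signature labels and the strain sets are invented, and the mapping from each signature to the strains it detects is assumed to be known in advance.

```python
def greedy_signature_cover(signature_hits, strains):
    """Greedily pick signatures until every strain is detected by at least one.

    signature_hits: dict mapping a signature name to the set of strains it detects.
    Returns (chosen signatures in pick order, strains that no signature covers).
    """
    uncovered = set(strains)
    chosen = []
    while uncovered:
        # Pick the signature that detects the most still-uncovered strains.
        best = max(signature_hits, key=lambda s: len(signature_hits[s] & uncovered))
        gained = signature_hits[best] & uncovered
        if not gained:
            break  # the remaining strains are not detected by any signature
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical example: four candidate signatures, five divergent strains.
hits = {
    "sigA": {"s1", "s2", "s3"},
    "sigB": {"s3", "s4"},
    "sigC": {"s4", "s5"},
    "sigD": {"s5"},
}
chosen, missed = greedy_signature_cover(hits, {"s1", "s2", "s3", "s4", "s5"})
```

The greedy choice does not guarantee the smallest possible signature set, but it gives a quick upper bound on the number of primer/probe sets required for a given collection of strains.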
175. Sequencing and Analysis of the Genome of Carboxydothermus hydrogenoformans, a CO-Utilizing, Hydrogen Producing Thermophile J. A. Eisen1, F. T. Robb2, J. Gonzalez2, T. Sokolova3, L. J. Tallon1, K. Jones1, A. S. Durkin2, and C. M. Fraser1
1The Institute for Genomic Research, Rockville, MD
Carboxydothermus hydrogenoformans is an extremely thermophilic bacterium that grows on CO as its only carbon and energy source under strictly anaerobic conditions. Here we present an update on the progress of sequencing and analyzing the genome of this species. Preliminary analysis reveals that this species is clearly a low-GC Gram-positive bacterium in most aspects of its core biology. However, this species encodes many genes, in particular those likely involved in energy metabolism, that are more commonly found in distantly related thermophilic or methylotrophic bacteria and Archaea. Analysis of various features of the genome will be presented.
202. What the Genome of Rhodopseudomonas palustris Tells Us About the Biology of a Versatile Phototrophic Bacterium
Caroline S. Harwood
Department of Microbiology, University of Iowa, Iowa City, IA 52242
Rhodopseudomonas palustris is a very successful photosynthetic bacterium that can be found in virtually any temperate soil or water sample on earth. It is among the most metabolically versatile of known bacteria and has many alternative ways of acquiring carbon and nitrogen and of generating energy. It is also robust and able to survive for long periods of time with very few nutrients. Each of these aspects of the biology of Rps. palustris is reflected in its 5.49 Mb genome. Rps. palustris has a large cluster of photosynthesis genes and a collection of additional genes that encode light-responsive proteins. It has genes for the catabolism of diverse kinds of carbon sources, including lignin monomers, sugars, fatty acids and dicarboxylic acids. It encodes two different carbon dioxide fixation enzymes and three different nitrogen fixation enzymes, each with a different transition metal at its active site. It has genes to carry out anaerobic respiration using nitric oxide and nitrite as electron acceptors and it has genes for thiosulfate oxidation and hydrogen oxidation. It is obvious that an organism with this degree of metabolic versatility must have a great deal of traffic with its environment. This is reflected by a very large number of transport systems, especially transport systems for iron. Genes for at least seven multidrug resistance efflux pumps and the presence of a cluster of genes for polyketide biosynthesis may help explain why Rps. palustris survives so well in most soil and water environments.
207. Environmental Genomics and Microbial Ecology
Edward F. DeLong
Monterey Bay Aquarium Research Institute, Moss Landing, CA
The complexity, variability and functional diversity within natural microbial assemblages can now be viewed in novel ways, using contemporary genome sequencing and analytical approaches. Genomic approaches provide equal access to genomes of naturally occurring microbes, circumventing the long-standing problem of the low cultivability of naturally occurring microbes. Large fragment cloning and screening techniques now can be used to archive and identify genome fragments of abundant, uncultured microbes. Genome structure, organization, and gene content can then be accessed via high throughput sequencing and comparative genome analyses. Natural genetic microvariation within and between populations can be examined in exquisite detail, at the genomic level. Additionally, large genome fragments serve as reagents for recombinant protein biosynthesis and characterization, providing insight into the biochemistry, physiology and function of indigenous microorganisms. Beyond individual microbes, multigenome libraries from the environment can be viewed en masse, and examined for patterns in functional, phylogenetic, or regulatory gene content and distribution. Holistic studies in 'population genomics' may well become a primary tool of microbiology, for comparing biotic patterns and processes within and between diverse microbial ecosystems. Conversely, microbial ecology can inform genome science, by providing the specific context and experimental systems ideal for characterizing the dynamics of genome evolution.
209. Microbial Defense Systems: Ecology, Evolution and Application
Margaret (Peg) Riley
Dept. of Ecology and Evolutionary Biology, Yale University
Microbes are engaged in a never-ending arms race. They produce a wealth of
biological weapons (toxins, antibiotics, bacteriocins, lysozymes, etc.) that
play a dominant role in mediating population- and community-level dynamics.
Each antimicrobial produced selects for a corresponding resistance mechanism.
As resistance invades, selection for a novel antimicrobial increases. Our studies
explore the molecular mechanisms involved in generating such extraordinary levels
of defense system and resistance diversity. We employ methods of experimental
evolution, surveys of natural populations, mathematical modeling and molecular
genetics to understand the evolution and ecology of these biological weapons
in the microbial world. As we learn how microbes engage in defensive strategies,
we apply this knowledge in the design and development of novel antimicrobials
for use in human health. The microbes are teaching us how to design and employ
antimicrobials that will prove to be more difficult and more costly to resist.
211. Gene Transfer: Past and Present
Claudia I. Reich, Carl R. Woese and Gary J. Olsen
Department of Microbiology University of Illinois at Urbana-Champaign
One of the major forces in the history of life has been lateral gene transfer,
and it is a primary interest of our laboratories. We will identify instances
of lateral gene transfer by phylogenetic analyses of gene families, by analyses
of codon bias, and by comparisons of closely related genomes. We will use the
instances found to estimate the frequencies of recent transfer to diverse lineages.
We will compare lineages with relatively high gene transfer rates to those with
lower rates, looking for genomic, evolutionary and life-style correlates with
high acquisition rates. So far, most of the work on lateral gene transfer has
focused on the recipient lineages. We will use molecular phylogeny to attempt
to localize the donor lineages of ancient transfers and codon bias to identify
possible sources of recent transfers. In the case of ancient lateral transfers,
close relatives of the donor organism are not accessible, since they all lived
in the remote past. However, in the case of recent transfers it is likely that
a close relative of the donor lineage still exists, and still maintains a copy
of the gene. We will use DNA hybridization and/or the polymerase chain reaction
to test for the presence of genes in candidate donor lineages. We will evaluate
the relative frequencies of gene transfer events from different donor lineages,
again with an eye toward possible life-style correlates (in the case of ancient
lineages, this can only be very approximate). Finally, the goal of greatest
interest, but also of greatest difficulty, is an analysis of the relationship
between gene transfer and biologically important events, such as the demarcation
of major lineages. This is a key area for theoretical understanding of microbial
evolution (and speciation), and there is a critical need for ideas to be solidly
anchored in real-world data.
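As an illustration of the codon-bias screen mentioned in this abstract, the sketch below compares the codon usage of one gene with the pooled codon usage of a set of reference genes from the same genome. It is a minimal sketch under stated assumptions, not the authors' method: the function names, the choice of half the L1 distance between frequency vectors, and the toy sequences are all hypothetical, and the abstract does not specify which statistic is used.

```python
from collections import Counter
from itertools import product

# All 64 codons over the DNA alphabet, in a fixed order.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_counts(seq):
    """Count in-frame codons in a coding sequence, ignoring ambiguous ones."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    raw = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    return Counter({c: raw[c] for c in CODONS if raw[c]})


def frequencies(counts):
    """Turn codon counts into a 64-element relative-frequency vector."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid codons found")
    return [counts[c] / total for c in CODONS]


def codon_usage_distance(gene, reference_genes):
    """Half the L1 distance between gene and pooled reference codon usage.

    Hypothetical statistic chosen for illustration: 0.0 means identical
    usage, 1.0 means no codon in common.
    """
    pooled = Counter()
    for ref in reference_genes:
        pooled.update(codon_counts(ref))
    gene_freq = frequencies(codon_counts(gene))
    ref_freq = frequencies(pooled)
    return 0.5 * sum(abs(g - r) for g, r in zip(gene_freq, ref_freq))


if __name__ == "__main__":
    # Toy sequences, invented purely to exercise the functions.
    reference = ["AAAGCTAAAGCT", "GCTAAAGCTAAA"]
    print(codon_usage_distance("AAAGCTAAAGCT", reference))
    print(codon_usage_distance("CCCGGGCCCGGG", reference))
```

A large distance only flags a gene as a candidate for recent transfer. Short genes give noisy frequency estimates, and highly expressed native genes can also show atypical codon usage, so any threshold would have to be calibrated against real data.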
212. Genomic Analysis of Geobacter Species Living in Pure Culture and in Subsurface Environments
Derek R. Lovley
Department of Microbiology, University of Massachusetts, Amherst, MA
Dissimilatory metal reduction shows promise as a strategy for the bioremediation
of subsurface environments contaminated with radioactive metals or organic compounds.
Dissimilatory metal reducers also provide a mechanism of harvesting energy from
waste organic matter in the form of electricity. Although there is a wide phylogenetic
diversity of dissimilatory metal-reducing microorganisms, molecular analyses
have demonstrated that microorganisms in the family Geobacteraceae are the predominant
metal-reducing microorganisms in a wide diversity of subsurface sediments. Genome-enabled
investigations of Geobacter species have demonstrated that their mechanisms
for accessing Fe(III) oxides and transferring electrons onto Fe(III) oxide surfaces
differ significantly from those of other well-studied Fe(III)-reducing microorganisms,
such as Shewanella and Geothrix. Geobacter sulfurreducens specifically expresses
genes for some c-type cytochromes when Fe(III) serves as the electron acceptor.
In some cases there is a direct correlation between the level of mRNA for a
c-type cytochrome gene and the rate of Fe(III) reduction in chemostat cultures.
Genetic analysis has demonstrated that some, but not all, c-type cytochromes
are required for Fe(III) reduction. Genetic and biochemical studies with G.
sulfurreducens and G. metallireducens have demonstrated that Geobacter species
specifically produce flagella and/or pili in response to growth on the insoluble
electron acceptors, Fe(III) and Mn(IV) oxides. Motile cells are chemotactic
to Fe(II) and Mn(II), providing a mechanism to locate Fe(III) and Mn(IV) oxides
under anaerobic conditions. These results suggest that G. metallireducens senses
when soluble electron acceptors are depleted and then synthesizes the appropriate
appendages to permit it to search for, and establish contact with, insoluble
Fe(III) or Mn(IV) oxides. This novel approach to the utilization of an insoluble
electron acceptor may explain why Geobacter species predominate over other
Fe(III) oxide-reducing microorganisms in a wide variety of sedimentary environments.
The predominance of Geobacteraceae in subsurface environments makes it possible
to study the genomic DNA of subsurface Geobacteraceae and compare the genetic
potential of subsurface Geobacteraceae with that of the Geobacteraceae being intensively studied
in the laboratory. Analysis of DNA extracted from a subsurface sediment from
a uranium mine tailings site, in which dissimilatory metal reduction had been
stimulated, indicated that some genes of the Geobacteraceae living in the sediments,
such as those involved in carbon metabolism, were highly homologous to the genes
found in the genome of Geobacter sulfurreducens. However, 40% of the open-reading
frames (ORFs) in the environmental Geobacteraceae genomic DNA had unknown functions
and did not closely match any of the known ORFs in the G. sulfurreducens genome.
To date, these studies suggest that the Geobacteraceae responsible for Fe(III)
and U(VI) reduction in uranium-contaminated subsurface sediments may have a
genetic potential that is similar, but not identical, to that of the Geobacteraceae already available
in culture. Future studies involving more intensive functional genomics analysis
and in silico modeling of metabolism are expected to provide further insights
into the mechanisms controlling the growth and activity of Geobacteraceae during
bioremediation and energy harvesting in subsurface environments.
213. Application of Exploratory Data Analysis Techniques to Visualize Extremely Large Sequence Data Sets
George M. Garrity1 and Timothy G. Lilburn2
1Bergey's Manual Trust, Department of Microbiology & Molecular Genetics, Michigan State University, East Lansing, MI 48824-1101, 2American Type Culture Collection, Manassas, VA
In compiling a comprehensive outline of the validly named prokaryotic taxa
to be included in the 2nd edition of Bergey's Manual of Systematic Bacteriology,
one of the more challenging problems has been visualizing the "biological
landscape". While phylogenetic trees can provide a satisfactory view of
the relationships among small numbers of strains, when applied to large data
sets (> 1000 sequences or OTUs) such graphs are no more useful than hierarchical
lists. Recently, we began experimenting with the use of exploratory data analysis
(EDA) techniques to view extremely large sequence data sets in an evolutionary
and organismal context. The tools we are developing use well understood, scalable
methods drawn from the field of multivariate analysis and the resulting models
were built using taxonomic and sequence data held by Bergey's Manual Trust
and the Ribosomal Database Project. To date, we have created a variety of stable
2-D and 3-D models that provide a comprehensible, one-page overview of the relationships
between thousands of sequences and that have proven quite useful in resolving
a variety of taxonomic problems. The methods we are developing are, however,
more broadly applicable. Preliminary work suggests that these tools may provide
new views of genomic data that will aid in predicting gene function, in identifying
horizontally transferred genes and in other genome-related tasks. An atlas of
the biological landscape enhanced with ecological, phenotypic or
other information will doubtless lead to novel hypotheses concerning the processes
and products of evolution.
214. Finishing/Investigating the Genomes of Prochlorococcus, Synechococcus, and Nitrosomonas: An Overview
P. Chain1, W. Regala1, L. Vergez1, S. Stilwagen2, F. Larimer3, D. Arp4, N. Hommes4, A. Hooper5, S. Chisholm6, G. Rocap7, B. Brahamsha8, B. Palenik8, and J. Lamerdin1
1Lawrence Livermore National Laboratory, Livermore, CA
The output of sequence data from sequencing centers, such as the DOE's Joint Genome Institute, has been rising at an exponential rate for the past decade or two. The increase in sequencing efficiency over the past few years has resulted in a bottleneck shift, from the accumulation of raw data to the finishing, annotation and analysis of genomes. The first two publications describing complete microbial genomes were reported in 1995. Only seven years later, there are approximately 60 complete, annotated microbial genomes available, along with published draft analyses of several multi-cellular eukaryotes. However, an even greater number of projects are either currently underway or are awaiting the finishing process, which provides a complete picture of the genome including contextual information, captures all the sequences missed in the draft phase, and adds a level of confidence to the genomic sequence. In support of the DOE's Carbon Sequestration and Management Program, we undertook the challenging task of finishing the genomes of three autotrophic bacteria which play unique roles in their soil and ocean ecosystems. The genomes of Prochlorococcus marinus MIT9313 and Synechococcus sp. WH8103, two cyanobacteria, had been drafted to 7-fold coverage by the JGI, while Nitrosomonas europaea sp. Schmidt was at near 14-fold coverage. Nitrosomonas europaea is an obligate ammonia-oxidizing beta-proteobacterium that can meet its carbon requirements entirely through the fixation of carbon dioxide, while Prochlorococcus and Synechococcus are the dominant photosynthetic organisms in the open ocean, contributing a significant proportion of the earth's biomass. Despite the excess sequence coverage of the Nitrosomonas genome, several genomic structural features made circularization a great deal more difficult than for the two cyanobacterial genomes.
With these finished genomes, complete annotation and analysis (including comparative analysis) may help elucidate the pathways relevant to understanding the physiological and genetic controls of photosynthesis, nitrogen fixation and carbon cycling. (This work was performed under the auspices of the U. S. Department of Energy by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48.)
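To make the "fold coverage" figures in this abstract concrete, the sketch below applies the standard idealized Lander-Waterman estimates: coverage is reads times read length divided by genome size, and under uniformly random read placement the expected uncovered fraction is exp(-coverage). The read count, read length, and genome size used here are hypothetical round numbers chosen only to give 7-fold coverage; they are not values from the projects described.

```python
import math


def fold_coverage(n_reads, read_length, genome_size):
    """Average number of times each base is sequenced."""
    return n_reads * read_length / genome_size


def expected_uncovered_fraction(coverage):
    """Idealized Lander-Waterman estimate of the fraction of bases never read.

    Assumes reads land uniformly at random, which real libraries only approximate.
    """
    return math.exp(-coverage)


def expected_islands(n_reads, coverage):
    """Idealized expected number of contigs, ignoring the minimum overlap
    needed to join two reads."""
    return n_reads * math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the abstract.
    n_reads, read_length, genome_size = 28_000, 600, 2_400_000
    c = fold_coverage(n_reads, read_length, genome_size)
    print(f"coverage: {c:.1f}-fold")
    print(f"expected uncovered fraction: {expected_uncovered_fraction(c):.6f}")
    print(f"expected contigs: {expected_islands(n_reads, c):.1f}")
```

Even under these idealized assumptions a 7-fold draft is expected to leave some gaps, and the model ignores repeats and cloning bias, which is consistent with the observation above that higher coverage did not by itself make the Nitrosomonas genome easier to close.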
219. Analyzing the Metagenome: Accessing the Uncultured Microbial World
University of Wisconsin-Madison
Microorganisms contribute to geochemical cycles, nutrient cycles, and small molecule chemistry, and enter into symbioses with macroorganisms as well as with other microorganisms. Despite their diversity, centrality to life, and abundance on earth, microorganisms represent a substantial gap in our understanding of the biosphere because most knowledge of microorganisms has been derived from cultured organisms, and most microorganisms, representing vast phylogenetic diversity, are inaccessible by standard culturing. We have developed a strategy that we call "metagenomics" to capture the idea of studying the collective genomes of the organisms in an environment. We have initiated analysis of the metagenome of the soil by extracting DNA directly from soil, cloning it in large fragments in bacterial artificial chromosome (BAC) vectors, and characterizing the resulting clones. The metagenomic libraries have yielded novel phylogenetic information, genomic insights, and new small molecules, including a structurally new antibiotic with broad-spectrum activity. Reassembly of complete genomes from soil is a daunting challenge with current limitations in sequencing and bioinformatics, as most soils likely contain thousands of species of bacteria. To attempt reassembly of a metagenome, we are exploring two other systems. One is the midgut of gypsy moth larvae, which contains bacteria of 10 phylotypes, six of which are culturable. The appeal of this community is that it is of moderate diversity, most of the species appear to be previously undescribed, and the gypsy moth midgut represents an extreme environment with a pH of 12. The second candidate for metagenome reconstruction is an acid mine drainage site. The advantages of this community are that it contains about 5 phylotypes of diverse affiliation, the genome of one member, Ferroplasma acidarmanus, is completely sequenced, and the organisms live in an extreme environment of pH 0-1.0.
These simple communities will provide prototypes for analysis of more complex metagenomes.