R pal. The automated pipeline. Mass spec and proteomics. These phrases are used by ORNL researchers who probe microbes to determine what these "bugs" are made of and what drives them. The Institute for Genomic Research and the Department of Energy's Joint Genome Institute (JGI) sequenced these microbes. They are among the 100 microbes annotated by an ORNL group of computational biologists led by Frank Larimer of ORNL's Life Sciences Division (LSD). This group identified and characterized most of these microorganisms' genes.
DOE is seeking more detailed information about the proteins encoded by these genes. Using a systems biology approach, ORNL researchers are trying to determine which microbial proteins, or groups of proteins called protein complexes, carry out a function of interest to DOE.
Of great interest to DOE and ORNL is "R pal," short for Rhodopseudomonas palustris. This bacterium, which can be grown in many different ways, could possibly be manipulated to produce hydrogen efficiently while fixing nitrogen or to take up carbon dioxide from the air, slowing the buildup of a greenhouse gas.
The workhorse instrument in the pipeline is the mass spectrometer. "Mass spec" is considered the world's leading tool for "proteomics," which entails rapidly identifying and characterizing proteins and the changes they undergo—called post-translational modifications (PTMs)—when a microbe is grown differently or is exposed to a toxic material that could reduce its ability to render a desired service.
"This year we are focusing on highthroughput, automated analysis of protein complexes in a large format process so we can do many things at one time in a massively parallel way using mass spectrometers and microscopes," says Buchanan. "The concept is not to follow a biological pathway from beginning to end. Rather, we are 'jumping' on a microbe and trying to identify as many of its protein complexes as we can as fast as possible. Once we obtain the parts list, biologists can use it to figure out how the parts interact."
The ORNL pilot project led by Buchanan involves growing microbes in different ways with special tags; extracting their protein complexes; identifying and characterizing the protein complexes using mass spectrometers and imaging tools, such as fluorescent microscopes; and sending mass spectra and other data to bioinformaticians and computational biologists for interpretation. These specialists write algorithms, improve supercomputer codes, and annotate genome sequences. One goal might be to identify the R. palustris protein most involved in hydrogen production.
ORNL researchers hope the project will strengthen the Laboratory's effort to compete for one of the DOE Office of Science's proposed new genomics user facilities—the Molecular Machines Characterization and Imaging Facility. CSD's Greg Hurst and Bob Hettich anticipate that the facility will have at least 60 mass spectrometers to meet DOE's goals.
"We will need various methodologies to characterize the interactions of protein complexes with each other and with other components of bacterial cells," Hettich says. "A high-throughput pipeline will be anchored around mass spectrometry, but there will also be lower-throughput parallel lines, such as imaging and neutron scattering. These technologies will be very important for targeting specific pieces of information for these biological systems."
Growing R Pal
LSD's Biochemical Engineering Research Group grows masses of bacteria in various ways in bioreactors for use in research. "If R. palustris is grown so that it receives energy from light and carbon from organic molecules, it will produce hydrogen," says LSD director Brian Davison. "If, however, R. palustris is grown so it gets energy from light and carbon from carbon dioxide, R pal could be used to slow the buildup of atmospheric CO2."
Extracting Protein Complexes
LSD's Dale Pelletier and his colleagues perform molecular biology to induce bacteria to express proteins that are tagged, so that protein complexes can be fished out or imaged inside live cells. One trick Pelletier's group uses to fish protein complexes out of bacterial cells is what Buchanan calls "selective Velcro." Multiple copies of special genetic sequences are added to R pal cells reproduced at ORNL. Within each cell, a protein called a 6-histidine tag grows as an attachment to a protein complex. The 6-histidine protein has an affinity for nickel.
Upon disruption of the cells' membranes, affinity reagents made of beads coated with nickel are mixed with the cell contents. "The 6-histidine binds to the nickel," Pelletier says. "We fish out the beads and out come protein complexes."
The goal is to create a library of antibodies that individually pair with specific microbial proteins. The antibodies can be used to extract target proteins and their partners.
A Leading Analytical Tool
Pelletier's group purifies the protein complexes extracted from the R pal cells and hands them over to CSD's GTL mass spectrometry effort, led by Hurst. This group uses liquid chromatography-tandem mass spectrometry (LC-MS/MS) to identify and characterize microbial proteins and protein complexes.
The mass spectrometry effort focuses on two general types of measurements. For the first approach, Hettich uses a Fourier transform ion cyclotron resonance mass spectrometer to do "top-down" identification of intact proteins in microbes. For the second approach, Nathan VerBerkmoes, a doctoral candidate at the University of Tennessee-ORNL Graduate School of Genome Science and Technology, has been a driving force in using CSD's three ion trap mass spectrometers for "bottom-up" identification of the components of proteins.
"Using LC-MS technology, we can identify a substantial portion of the R. palustris proteome," Hettich says. "In the more common bottom-up MS approach, the complex protein sample from R. palustris is digested with the protease trypsin, which selectively cuts all the proteins into smaller pieces called peptides. We identify the individual peptides by investigating their fragmentation using tandem MS, and then assemble the information to identify the original proteins present in the sample."
"First, we identify and catalog proteins in the R pal bacterium," Hettich continues. "Then we try to determine how much of each protein is present when R pal is grown under different conditions. Mass spec is the best tool for not only identifying proteins but also for characterizing their PTMs."
Recently, Hettich, Hurst, VerBerkmoes, and postdoctoral associate Michael Brad Strader identified and characterized the 54 proteins that make up the R pal ribosome, the cell's protein "factory." VerBerkmoes and his collaborators also catalogued all the proteins produced in R pal by its various growth states and measured changes in their abundance. "Our study is the first global look at R pal under all its growth states," Hettich says. "We provided a useful starting point for many biological investigations of this microorganism. We identified proteins that were either unknown previously or were not expected to be so important under different growth states."
"ORNL has identified more than a dozen protein complexes so far," Buchanan says. "Our target for 2005 is to identify and characterize 500 protein complexes through work with our collaborators, especially DOE's Pacific Northwest National Laboratory."
RNA and Microarrays
Another bacterium that could be useful to DOE is Shewanella oneidensis because of its potential for converting radioactive uranium compounds into a less soluble state so that they sink into the sediments or stay put in soil. A DOE objective is to prevent uranium contaminants from dissolving in groundwater and flowing off site, where the uranium could endanger public health.
To help answer questions about Shewanella, DOE has sought help from the group led by Jizhong Zhou, a pioneer in the environmental applications of microarrays and a group leader in ORNL's Environmental Sciences Division. A microarray is the only available tool for capturing genome-wide, or global, information about the intricate timing and coordination of gene regulation at the level of RNA in bacterial cells. With a grid of red and green dots of different brightnesses, a microarray indicates which genes encode a high level of protein production and which ones instruct the host cell to produce little or no protein. The process allows scientists to compare gene activity in different microbes and their mutants when exposed to toxic metals such as uranium, strontium, and chromium.
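The red and green dots on a two-color microarray are often summarized as a log2 ratio of the two channel intensities for each gene. The sketch below shows that calculation under stated assumptions: red is taken to be the treated sample and green the reference, a twofold change (log2 ratio of 1) is used as the cutoff, and the gene names and intensities are invented. It is not the analysis used by Zhou's group.

```python
import math


def log2_ratio(red: float, green: float, floor: float = 1.0) -> float:
    """Return log2(red / green), flooring intensities to avoid log of zero."""
    return math.log2(max(red, floor) / max(green, floor))


def classify_spots(spots: dict[str, tuple[float, float]],
                   threshold: float = 1.0) -> dict[str, str]:
    """Label each gene 'up', 'down', or 'unchanged' from its (red, green) pair.

    Assumption: red = treated sample, green = reference sample, and a
    log2 ratio of +/- threshold marks a meaningful change.
    """
    calls = {}
    for gene, (red, green) in spots.items():
        ratio = log2_ratio(red, green)
        if ratio >= threshold:
            calls[gene] = "up"
        elif ratio <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


if __name__ == "__main__":
    # Hypothetical spot intensities (red, green) for three made-up genes.
    example = {
        "geneA": (4000.0, 1000.0),
        "geneB": (500.0, 2000.0),
        "geneC": (1200.0, 1000.0),
    }
    print(classify_spots(example))
```

With these made-up values, geneA is called "up", geneB "down", and geneC "unchanged". Real microarray analysis also involves background subtraction, normalization, and replicate statistics, all of which are omitted here.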
"We have created 40 different mutants of Shewanella bacteria," Zhou says. "Mutant bacteria are important to the understanding of the functions of genes. We are using microarrays to determine which bacterial genes encode proteins under different conditions. That way we will find out which genes enable a bacterium to effectively reduce a target contaminant despite the presence of other toxic materials."
Imaging Live Cells in Action
One way to observe which proteins are together in a complex is live-cell imaging. Mitch Doktycz and his colleagues in LSD are developing ORNL's
Doktycz's group has an epifluorescent microscope and a recently acquired confocal laser scanning microscope, now the standard tool for live-cell imaging. The instrument enables researchers to see which proteins are interacting with each other and with other molecules inside a live cell in real time.
The Computer Connection
Researchers in LSD's Genome Analysis and Systems Modeling Group, led by Frank Larimer and Ed Uberbacher, are a key part of the pipeline. These researchers are also part of the Computational Biology Institute in DOE's Center for Computational Sciences at ORNL, which houses several supercomputers. They develop and apply algorithms, models, pattern recognition programs, and simulation methods and work on automating the pipeline's computational part.
The major emphasis of the group has been to identify genes in sequenced DNA. The researchers found genes in 100 microbes and in human chromosomes 5, 16, and 19 after they were sequenced by JGI. In addition, the group has developed the PROSPECT algorithm to predict the three-dimensional structures of proteins—important clues to their functions. To support the Genomics: GTL project, this group of computational biologists is improving the flexibility, efficiency, and accuracy of peptide identification algorithms.
Current software is designed to handle relatively small data collections.
To better understand what protein complexes do in bacterial cells, Larimer and his colleagues characterize and describe the components of a cell and its environment. They "guess" which proteins are processing specific metabolites, which include sources of energy, carbon, and nitrogen needed by cells. Then they build a model to describe how the organism works, while characterizing the functions of its components. "We may build a systems model of a bacterial community," Larimer says, "and predict what the community will do if a toxic metal is added."
Systems biology remains a tough challenge. ORNL is counting on new state-of-the-art tools and facilities, combined with an excellent staff and collaborators, to demonstrate the feasibility of assembling and operating an automated pipeline. ORNL researchers are increasingly confident that these assets will help lead them to significant scientific discoveries about biological systems.
Web site provided by Oak Ridge National Laboratory's Communications and External Relations