DOE Genome Contractor-Grantee Workshop IX
January 27-31, 2002 Oakland, CA
26. Method for Fast and Highly Parallel Single Molecule DNA Sequencing
Jonas Korlach, Michael Levene, Stephen W. Turner, Harold G. Craighead, and Watt W. Webb
School of Applied and Engineering Physics, Cornell University, Ithaca, NY 14853
There is an urgent need for the ability to sequence single molecules of DNA with long read lengths to provide intrinsic haplotyping and SNP profiling capability, enable the detection of extremely rare species or strains in a sample, and eliminate prior amplification steps.
Our project of developing such a DNA sequencing method is based on the proposition that the temporal order of base additions during DNA polymerization of a single nucleic acid molecule can be measured in real time. The activities of single molecules of DNA polymerase are observed in a detection format allowing the separate observation of many individual molecules simultaneously, thus creating a very fast and highly parallel new method of DNA sequence acquisition. The properties of the enzyme and the format of data acquisition facilitate evolution of this technology into compact and inexpensive DNA sequence analysis systems, suitable for rapid de novo or resequencing of DNA using unprocessed, unpurified samples. Hundreds of sequencing reactions could be performed simultaneously, with read lengths of individual reactions ranging into the tens to hundreds of kilobase pair size range.
In the course of developing this technology, challenges had to be overcome regarding (i) the design, synthesis and evaluation of suitable fluorescently labeled nucleotide analogs which are both amenable to single molecule detection and efficiently utilized by the polymerase, and (ii) the fabrication of nanostructured devices permitting detection of single molecule polymerase activity at high fluorophore concentrations. Single base incorporation events were recently observed using this technology, demonstrating the validity in principle of this novel sequencing approach.
As efficient DNA synthesis occurs only at substrate concentrations much higher than the pico- or nanomolar regime typically required for single molecule analysis, zero-mode waveguide nanostructures are described as a way to overcome this limitation. They effectively reduce the observation volume to tens of zeptoliters (10^-20 L), thereby enabling an increase in the upper concentration limit amenable to single fluorophore detection. Zero-mode waveguides thus extend the range of biochemical reactions that can be studied on a single molecule level into the micromolar range.
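The concentration argument can be checked with a short calculation of the mean number of fluorophores in the observation volume. The sketch below is illustrative only: the 10 zeptoliter figure is the low end of the range quoted above, while the 1 femtoliter comparison volume (a typical diffraction-limited focus) and the function name are assumptions not taken from the abstract.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def mean_molecules(concentration_molar: float, volume_liters: float) -> float:
    """Average number of molecules found in a volume at a given molar concentration."""
    return concentration_molar * volume_liters * AVOGADRO


ZMW_VOLUME_L = 1e-20       # 10 zeptoliters, low end of "tens of zeptoliters"
CONFOCAL_VOLUME_L = 1e-15  # ~1 femtoliter; an assumed conventional observation volume

for label, conc in [("1 nM", 1e-9), ("1 uM", 1e-6), ("10 uM", 1e-5)]:
    in_confocal = mean_molecules(conc, CONFOCAL_VOLUME_L)
    in_zmw = mean_molecules(conc, ZMW_VOLUME_L)
    print(f"{label}: confocal {in_confocal:.3g}, zero-mode waveguide {in_zmw:.3g}")
```

Under these assumptions, a micromolar solution places hundreds of fluorophores in a femtoliter volume but well under one, on average, in a 10 zeptoliter volume, which is the regime in which single fluorophore detection becomes possible.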
Supported by DOE grant DE-FG02-99ER62809.
27. Fast Detection of Nucleic Acid Hybridization with a Tapered Optical Fiber Sensor
Hyunmin Yi2, Vildana Hodzic2, James J. Sumner2, Matthew P. Delisa2, Saheed Pilevar2, Frank H. Portugal2, James B. Gillespie2, Christopher C. Davis2, and William E. Bentley1
1Center for Agricultural Biotechnology and Department of Chemical Engineering, University of Maryland, College Park, MD 20742
A tapered single-mode optical fiber sensor was used to investigate gene regulation by hybridization of nucleic acids. The sensor is based on the evanescent field excitation of fluorescence of surface-bound fluorophores. The same tapered optical fiber is used to excite and collect fluorescence. This sensor design can potentially eliminate a number of problems typically encountered in other systems. Use of infrared fluorophores prevents background signals from natural visible region fluorescence, making extensive purification procedures and (RT)PCR amplification unnecessary. A pulsed mode operation prevents photobleaching of the dye and enables multiple detections. Further, the sensor surface is easily regenerated and the signal detection is nearly instantaneous for on-line monitoring of the target gene transcription without additional analyte labeling. Various derivatives of fluorescein were used to quantitate chemically modified sensor surfaces and test alternative chemical crosslinkers. Oligonucleotides of 20 base pairs with functional groups at either end (5' or 3') were covalently or noncovalently incorporated onto the sensor surface and used as probe molecules for complementary analyte strands in the samples. Regulation of the dnaK gene in a high cell density culture of Escherichia coli was investigated as a model system. Finally, Near-field Scanning Optical Microscopy (NSOM) and evanescent wave excitation of surface-bound fluorophores on a prism surface were used to further investigate surface behavior.
28. High Performance Capillary Electrophoresis in DNA Sequencing and Analysis: Recent Developments
Barry L. Karger, Lev Kotler, Arthur Miller, and Hui He
Barnett Institute, Northeastern University, 360 Huntington Avenue, Boston, MA 02115
Our laboratory has devoted efforts to enhance separation of Sanger sequencing fragments to increase productivity of DNA sequencing. In the past, we developed linear polyacrylamide (LPA) to be the matrix with the longest read lengths (1300 bases in 2 hours). Recently, we focused in detail on the reasons for this performance. We compared N,N-dimethylacrylamide with LPA and found that a key advantage of LPA is its ability to achieve high performance at 70-75 °C, whereas the more hydrophobic polymer significantly loses separation above 50 °C, due to a less robust entanglement structure. A second factor in our long read length ability relates to a base-caller that is able to read sequence down to peak resolution as low as 0.25. We have also recently shown that read lengths of 975 bases can be achieved in 40 min by careful attention to the denaturants used in the buffer. These results will be described.
Additionally, we will describe the use of an automated fraction collector for capillary array electrophoresis for determination of mutant species. Mutants are separated from wild type by constant denaturant capillary electrophoresis. The collected components are then sequenced for determination of the specific mutations. The automated multiple array fraction collection approach can also be used for determination of differentially expressed cDNAs, which is especially useful for organisms whose genomes are not yet sequenced. We will illustrate these applications in this poster.
29. Microchannel DNA Sequencing by End-Labeled Free Solution Electrophoresis (ELFSE): Development of Polymeric End-Labels, Wall Coatings, and Electrophoresis Methods
Wyatt N. Vreeland, Jong-In Won, Robert J. Meagher, M. Felicia Bogdan, and Annelise E. Barron
Northwestern University, Dept of Chemical Engineering, Evanston IL USA 60208
High-performance DNA separation matrices required for sequencing, which are based on viscous, entangled polymer solutions, require application of high pressure for rapid loading into the narrow-diameter channels used in current state-of-the-art microchannel electrophoresis sequencing instruments. We are working to develop a new method of free-solution DNA sequencing based on the separation of DNA-protein bioconjugates, which still uses the Sanger ddNTP chain termination reaction but obviates the need for a “gel”. As such, this new method, called End-Labeled Free Solution Electrophoresis (ELFSE) is especially amenable to high-field electrophoresis in microfluidic chips.
ELFSE functions through the attachment of a monodisperse, uncharged, polymeric “molecular drag-tag” to the 5’ terminus of DNA sequencing fragments. The fixed amount of hydrodynamic drag this frictional label engenders for each Sanger sequencing fragment enables size-based separation of DNA by electrophoresis in free solution (i.e., in the absence of a sieving matrix). The scaling law for the electrophoretic mobility of a sequencing fragment is given by equation 1:
where ρ and ξ are the charge and effective hydrodynamic friction of a single DNA base, respectively, N is the number of DNA bases in the sequencing fragment, and β and α are the amount of charge and drag of the end label in units of DNA bases. Ideally, β = 0, that is, the labels are uncharged. Thus, to enable long-read DNA sequencing, polymeric end-labels that engender substantial hydrodynamic drag are required, as labels that provide low amounts of hydrodynamic drag are only effective in separating DNA fragments of short lengths.
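Equation 1 is not reproduced in this text. One form consistent with the definitions given here, assuming that free-solution mobility is simply the conjugate's total charge divided by its total friction, is sketched below. In it, ρ and ξ denote the per-base charge and friction, β and α denote the label's charge and drag in units of DNA bases, and μ_0 is a shorthand introduced here for the free-solution mobility of unlabeled DNA. This is a reconstruction from the stated definitions, not the authors' typeset equation.

```latex
\mu = \frac{\rho\,(N + \beta)}{\xi\,(N + \alpha)}
\qquad\xrightarrow{\;\beta = 0\;}\qquad
\mu = \mu_0\,\frac{N}{N + \alpha},
\quad \mu_0 \equiv \frac{\rho}{\xi}
```

Under this form, the mobility difference between fragments of N and N+1 bases falls off roughly as \mu_0\,\alpha/(N+\alpha)^2. For a given N this is largest when α is comparable to N, in line with the statement that low-drag labels resolve only short fragments.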
The production of a high molar mass, uncharged and completely monodisperse polymer that will not substantially interact with microchannel walls and that has a unique point for attachment of DNA sequencing fragments is not a trivial task. Specifically, our efforts have focused on determining what chemical characteristics are necessary for an optimal drag-tag, and on developing cloning methodology to allow the production of high-molar mass protein polymers. To this end, we have developed a unique cloning strategy to produce protein polymer end-labels of large and controlled length (up to 1000 amino acids) from synthetic genes in E. coli. To date we have employed ELFSE-type separations for the molar mass profiling of the end-label molecules, highly multiplexed genotyping of clinically relevant DNA samples via Single Base Extension (SBE) methodology, and demonstrated separation of end-labeled DNA sequencing fragments. We have developed a highly hydrophilic, adsorptive polymer wall coating that enables high-resolution separation of DNA-protein conjugates. Current challenges we are addressing include optimization of the chemical nature of the protein polymer end-label as well as of the electrophoretic separation techniques and protocols.
30. Microfabricated Fluidic Devices for the Analysis of Genomic Materials
K. A. Swinney, R. S. Foote, C. T. Culbertson, S. C. Jacobson, and J. Michael Ramsey
Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831-6142
We are developing monolithic microfabricated fluidic devices for the analysis of genomic materials including DNA, peptides, proteins and cells. Microfluidics offers many potential advantages for performing automated and rapid analyses of small quantities of biological materials. Most strategies for analysis of genomic materials include a chemical separation process. Microfabricated separation devices have typically provided a separation performance equivalent to conventional laboratory technology using orders of magnitude less sample materials and taking one-to-two orders of magnitude less time.
One of the limitations of microfabricated separation devices is the resolving power achievable within a small footprint, e.g., a few centimeters on a side. We have been investigating strategies that could allow greater separative performance using short separation distances. Such capabilities are necessary to keep devices small, enhancing practicability and reducing costs. Microfabricated devices and biochemical strategies that allow the comprehensive analysis of proteins are also being developed. Comprehensive two-dimensional separations have been demonstrated for peptides that yield a peak capacity of approximately 1000. Improvements of a factor of at least four appear possible. We are attempting to move this technology toward the analysis of protein mixtures. Moreover, the use of microfluidics for the automated analysis of the contents of single cells is being pursued.
The goal is to automate the loading of cells with reagents, incubation, lysis and analysis of cellular contents through chemical separations. Initial results have been obtained from a device that accomplishes the latter two steps. Jurkat cells have been loaded with Oregon Green and the automated analysis of individual cells produces an electropherogram for each one showing the Oregon Green and its metabolites. The analysis rate in this case was approximately 10 cells/min. Various aspects of these devices will be described.
31. Molecular Gates for Improved Sample Cleanup and Handling in Microfabricated Devices
Tzu-Chi Kuo, Donald M. Cannon, Mark A. Shannon, Paul W. Bohn, and Jonathan V. Sweedler
Department of Chemistry, Department of Mechanical Engineering, and Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois 61801
The development of integrated systems capable of automated accurate sequence generation from sample introduction to sequence output is an important goal of the DOE Human Genome Project. Microfabricated DNA analyzers (such as microfabricated PCR systems with integrated CE systems) have been developed that offer a number of important advantages compared to the traditional large-scale methods. However, the actual interface between these microfabricated subassemblies can be problematic.
Extension of microfabricated microfluidic devices to three dimensions opens new vistas for applications and parallels the massively three-dimensional architectures characteristic of electronic devices. Externally controllable interconnects, employing nuclear track-etched polycarbonate membranes with nanometer diameter pores, are described that produce hybrid three-dimensional fluidic architectures. Using nanofluidic structures to connect microfluidic channels allows a variety of flow control concepts to be implemented, leading to hybrid fluidic architectures of considerable power and versatility. The key distinguishing feature of nanofluidic channels is that fluid flow occurs in structures of the same size as physical parameters that govern the flow. Furthermore, the separations capacity factor, k', governed by the surface-to-volume ratio, can be quite large. For example, k' increases by ~120 when a 200 nm i.d. nanopore with a 10 nm thick coating is compared with a 20 micron i.d. wall-coated open tubular column with the same coating. These nanofluidic interconnects can be thought of as fluidic diodes, albeit with a much richer array of parameters to control biasing.
Forward/reverse bias is controlled by applied potential, surface charge density (pH controllable), ionic strength, and even by the characteristics of the fluidic network in which the interconnect is placed. We demonstrate the use of these interconnects to collect an analyte band from an electrophoretic separation and transport the band to another fluidic layer. The construction and operating characteristics of these devices are described for a variety of applications.
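The capacity-factor comparison quoted for the 200 nm nanopore and the 20 micron column can be reproduced approximately from geometry alone. The sketch below assumes that k' scales with the phase ratio (coating volume over open-bore volume) for an identical coating, and that the quoted inner diameter refers to the bare bore so that the coating narrows the open channel. Both are assumptions made for illustration, and the function name is hypothetical.

```python
def phase_ratio(inner_diameter_nm: float, coating_nm: float) -> float:
    """Coating (stationary-phase) volume divided by open-bore volume, per unit length.

    Assumes the quoted i.d. is the bare bore and the coating lines its wall,
    so the open radius is the bare radius minus the coating thickness.
    """
    bare_radius = inner_diameter_nm / 2.0
    open_radius = bare_radius - coating_nm
    coating_area = bare_radius**2 - open_radius**2  # the common factor of pi cancels
    return coating_area / open_radius**2


nanopore = phase_ratio(200.0, 10.0)   # 200 nm i.d. pore, 10 nm coating
column = phase_ratio(20_000.0, 10.0)  # 20 micron i.d. column, same coating
enhancement = nanopore / column
print(f"k' enhancement ~ {enhancement:.0f}")
```

With these assumptions the ratio comes out near 117, in reasonable agreement with the ~120 quoted. Treating the quoted diameters as the open bore instead gives a somewhat smaller figure, so the exact value depends on the convention used.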
The successful molecular gate has the ability to transfer a particular DNA band on-device after PCR analysis to another fluidic layer to allow sample cleanup and even to capture a particular DNA band eluting from the separation channel for further characterization. This allows easier interfacing between the separate components of a total “lab-on-a-chip” sequencer. We have optimized molecular gate technology and are developing the protocols for high efficiency and fast sample capture and release.
32. Electron Tomography of Whole Cells
Grant J. Jensen and Kenneth H. Downing
Lawrence Berkeley National Laboratory
Recent advances suggest that whole microbial cells could be imaged by electron tomography to “molecular” resolution, sufficient to locate and identify large macromolecular complexes in their native state within their cellular contexts. Such an advance could prove crucial for the success of the Microbial Cell Project, as no other existing imaging modality can be expected to provide the high resolution structural information necessary to characterize many gene products, track complex formation and pathways, finely localize structures and functions within the cell, and understand the detailed effects of future attempts to customize microbes. Electron tomography proceeds by quick-freezing whole cells in a thin, aqueous film across an electron microscope grid, recording projection images of them from multiple directions in the microscope, and combining these images in a computer to produce a three-dimensional reconstruction. We propose to develop and test this technique on two microbes: one chosen for its ideal imaging characteristics, and the other for its potential Department of Energy mission relevance.
This project combines the expertise of key pioneers in the fields of high resolution protein and cellular imaging by electron microscopy, an ideally equipped electron microscope, and a uniquely well suited model system that will clearly reveal the potential contribution of electron tomography towards the goals of the Microbial Cell Project.
33. Cast Thy Proteins Upon the Water: Fluid Proteomics in a 2–D World
Barry Moore, Chad Nelson, Mike Giddings, Mark Holmes, Melissa Kimball, Norma Wills, John Atkins, and Ray Gesteland
Department of Human Genetics, University of Utah, Salt Lake City, Utah, USA
Genome sequencing has been an exceptionally successful endeavor with completion of over 60 non-viral genomes, and another 573 underway. Understanding the functional products of these genomes – proteins – has developed into the perhaps more difficult, but nonetheless rapidly developing field of proteomics. Much of the energy and resources in proteome projects has been invested in peptide mapping by 2-D gel electrophoresis followed by MALDI-MS analysis. One shortcoming of the 2-D gel approach, however, is that it does not allow for determination of the full length protein mass, as the proteins must be proteolytically digested in order to be removed from the gels.
We have developed a two-dimensional LC-MS approach to protein identification. This approach allows us to determine the mass of the full length protein as well as to perform peptide mapping of its fragments – potentially providing information on post-translational modifications, signal peptide cleavage sites, cotranslational recoding events, and errors in ORF prediction found in the database. Furthermore, while most peptide mapping search algorithms rely on searching predefined ORFs, we have developed a searching algorithm that searches total genomic sequence for peptide matches. This allows peptide mapping to identify programmed frameshifting events and short or ambiguous ORF proteins that are rejected by ORF prediction software and are thus unavailable to ORF-based peptide mapping search algorithms.
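The idea of searching total genomic sequence rather than predefined ORFs can be illustrated with a six-frame translation search. This is a minimal sketch, not the authors' algorithm: it matches an already-known peptide string instead of peptide masses, it uses the standard nuclear genetic code (yeast mitochondria use a variant code), and all function names and the toy sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate codon by codon from position 0; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def find_peptide(genome: str, peptide: str):
    """Return (strand, frame, offset) for every match in all six reading frames.

    The offset is the nucleotide position on the strand that was translated,
    so '-' strand offsets refer to the reverse-complemented sequence.
    """
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            start = protein.find(peptide)
            while start != -1:
                hits.append((strand, frame, frame + 3 * start))
                start = protein.find(peptide, start + 1)
    return hits


toy_genome = "CC" + "ATGGCCAAGTGGTAA" + "G"  # encodes M-A-K-W-stop in frame 2
print(find_peptide(toy_genome, "MAKW"))
```

Because no ORF boundaries are imposed, a peptide is found wherever it is encoded, including in frames or regions that ORF prediction software would have discarded.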
We have applied these techniques to yeast mitochondria, and have identified a number of proteins, providing new information on the mass of predicted, but unknown proteins and corrections to database errors.
34. Single Cell Proteome Analysis — Ultrasensitive Protein Analysis of Deinococcus radiodurans
Shen Hu, Amy Dambrowitz, Roger Huynh, and Norm Dovichi
Department of Chemistry, University of Washington
We are developing technology to monitor changes in protein expression in single tetrads of D. radiodurans following exposure to ionizing radiation. We hypothesize that exposure to ionizing radiation will create a distribution in the amount of genomic damage and that protein expression will reflect the extent of radiation damage. To test these hypotheses, we will develop the following technologies:
These technologies will be combined to determine protein expression in single tetrads of D. radiodurans, the extent of DNA damage following exposure to Cs-137 radiation, and the amount of chromosomal and rRNA per cell. This technology will be a powerful tool for functional analysis of the microbial proteome and its response to ionizing radiation.
35. High-Throughput SNP Scoring with GAMMArrays: Genomic Analysis Using Multiplexed Microsphere Arrays
P. Scott White, Hong Cai, David Torney, Lance Green, Diane Wood, Francisco Uribe-Romeo, LaVerne Gallegos, Julie Meyne, Paul Jackson, Paul Keim, and John Nolan
Bioscience Division and Theoretical Division, Los Alamos National Laboratory and Department of Microbiology, Northern Arizona University
We have developed a platform for the discovery and scoring of SNPs that is capable of meeting greatly increasing demands for high throughput and low cost assays. Called GAMMArrays, or Genomic Analysis using Multiplexed Microsphere Arrays, the basic platform consists of fluorescently labeled DNA fragments bound to microspheres, which are analyzed using flow cytometry. The platform provides no-wash assays that can be analyzed in less than 1 minute per sample, with sensitivities far superior to other approaches. SNP scoring is performed using minisequencing primers and fluorescently labeled dideoxynucleotide terminators. Furthermore, by using commercially available sets of multiplexed microspheres, it is possible to score dozens to hundreds of SNPs simultaneously. Multiplexing, when coupled with the high throughput rates possible with this platform, makes it possible to score several million SNPs per day at costs that are a fraction of those of competing technologies.
GAMMArrays are enhanced by the use of universal oligonucleotide tags. These tags consist of carefully designed, unique DNA tails, or capture tags, incorporated into each minisequencing primer, that are complementary to an address tag attached to a discrete population of microspheres in a multiplexed set. This enables simultaneous minisequencing of large numbers of SNPs in solution, followed by capture onto the appropriate microsphere for multiplexed analysis by flow cytometry.
We present results from multiplexed SNP analyses of bacterial pathogens, and human mitochondrial DNA and HLA genetic variation. These analyses are performed on a small number of relatively large PCR amplicons, each containing numerous SNPs that are scored simultaneously. In addition, these assays are easily integrated into conventional liquid handling automation systems, and require no unique instrumentation for setup and analysis. Very high signal-to-noise ratios, ease of setup, flexibility in format and scale, as well as low cost of these assays make them highly versatile and extremely valuable tools for a wide variety of studies where SNP scoring is needed.
36. Characterization of the D. radiodurans Proteome using Accurate Mass Tags
R. D. Smith1, G. A. Anderson1, M. S. Lipton1, L. Pasa-Tolic1, J. Fredrickson1, J. R. Battista2, M. J. Daly3, C. Masselon1, R. J. Moore1, M. F. Romine1, Y. Shen1, and H. R. Udseth1
1Pacific Northwest National Laboratory
In our approach to microbial proteomics our objective is to circumvent the limitations of conventional approaches by directly characterizing the cell’s polypeptide constituents using a combination of high resolution separations and the mass accuracy and sensitivity obtainable with Fourier transform ion cyclotron resonance (FTICR) mass spectrometry. Protein identification is based upon global approaches for protein digestion and accurate peptide mass analysis for the generation of “Accurate Mass Tags” (AMTs). Our two-stage strategy exploits FTICR for validation and subsequent routine measurement of peptide AMTs from “potential mass tags” initially identified using tandem mass spectrometry methods, thus providing the basis for high throughput proteome-wide measurements. A single high resolution capillary liquid chromatography separation combined with high sensitivity, high resolution and accurate FTICR measurements has been shown to be capable of characterizing peptide mixtures of more than 100,000 components, sufficient for broad protein identification in microbial systems. Attractions of the approach include the capability for automated high-confidence protein identification, broad and unbiased proteome coverage, and the capability for exploiting stable-isotope labeling methods for quantitative relative protein abundance measurements. Using this strategy, we have been able to identify AMTs for >60% of the potentially expressed proteins in the organism Deinococcus radiodurans. Approximately 32% and 16% of the ORFs from the D. radiodurans database are predicted to be hypothetical (having no significant homology to any proteins in any other public genome sequence databases at the time of annotation) and conserved hypothetical (having limited homology to a functionally uncharacterized ORF), respectively. We identified 48% of these hypothetical proteins and 55% of the conserved hypothetical proteins.
The approach will also be shown to allow the detection of modified proteins, as well as quantitative measurements of changes in protein abundances resulting from environmental perturbations.
This work was supported by the Office of Biological and Environmental Research of the U.S. Department of Energy. Pacific Northwest National Laboratory is operated for the U.S. Department of Energy by Battelle Memorial Institute through Contract No. DE-AC06-76RLO 1830.
37. Combining “Top-Down” and “Bottom-Up” Mass Spectrometry Approaches for Proteomic Analysis: Shewanella oneidensis — A Case Study
Robert Hettich, Nathan VerBerkmoes, Jonathan Bundy, James Stephenson, Loren Hauser, and Frank Larimer
Oak Ridge National Laboratory
The current rapid expansion of the field of proteomics will help this become one of the key components of the DOE-BER Genomes-To-Life Project, in particular assisting in the determination of gene function for the Microbial Genomics Program. The development of methods for rapid, large-scale mass spectrometry (MS) analyses of proteins from complex biological samples is considered to be critical for proteome studies. Two major approaches for MS-based proteome analysis have been employed to date. In the most common “bottom-up” approach, proteins are separated, proteolytically digested, and subsequently identified via MS analysis of the resultant peptides. An alternative approach, termed the “top-down” method, involves analysis and identification of intact proteins via accurate mass measurement. Since an intact protein mass is measured, this method may be advantageous for the detection of post-translational modifications, which may be missed in analyses by the bottom-up approach, where only a fraction of the total peptide population may be detected.
We have developed a novel method for proteome analysis that integrates both the top-down and bottom-up approaches, capitalizing on the unique capabilities of each method. Bacterial cellular lysates were initially fractionated via anion exchange liquid chromatography. Each fraction, which consisted of intact protein species, was divided into two samples. The first sample of each fraction was digested with trypsin protease and analyzed with reversed phase-LC-MS/MS for protein identification. This two-dimensional separation strategy greatly enhanced the number of proteins that could be identified as compared to a similar analysis of an unfractionated lysate. The second sample of each fraction was analyzed on an electrospray Fourier transform mass spectrometer for high-resolution identification of the intact proteins with the top-down approach. The use of the two methods in concert enabled the facile detection of such common post-translational modifications as loss of N-terminal methionine and signal peptide cleavages. This new approach was applied in a preliminary proteomic analysis of Shewanella oneidensis, a metal reducing microbe of potential importance to DOE in the field of bioremediation. With this experimental MS approach, it was possible to identify over 700 proteins from S. oneidensis, with the identification including ribosomal proteins, hypothetical proteins, suspected metal reducing proteins, and membrane proteins. One key protein, an ATP-binding protein, was found to have a molecular mass that was about 3 kDa lighter than that expected from the gene sequence. Detailed inspection of high-resolution MS data indicated that a 20 amino acid signal peptide had been cleaved off the N-terminus of the protein. These experimental data are being used to refine bioinformatic tools that are used to predict the presence and identity of signal peptides.
We feel that this level of information is a unique capability of our combination “top-down” — “bottom-up” MS proteomic method.
38. Oligonucleotide Mixture Analysis via Electrospray and Ion/Ion Reactions
Scott A. McLuckey1, Jin Wu1, Jonathan L. Bundy2, James L. Stephenson, Jr.2, and Gregory B. Hurst2
1Department of Chemistry, Purdue University, West Lafayette, IN
Electrospray ionization combined with ion/ion reactions in a quadrupole ion trap can be used for the direct analysis of oligonucleotide mixtures. Key to the success of this approach are factors related to ionization, ion/ion reactions, and mass analysis. This work deals with issues regarding the ion polarity combination, viz., positive oligonucleotides/negative charge transfer agent versus negative oligonucleotides/positive charge transfer agent. Anions derived from perfluorocarbons appear to be directly applicable to mixtures of positive ions derived from electrospray of oligonucleotides, in direct analogy with positive protein ions. Conditions for forming positive oligonucleotide ions devoid of adducts were more difficult to establish than for forming relatively clean negative oligonucleotide ions. A new approach for manipulating negative ion charge states in the ion trap is described and is based on use of the electric field of the positive charge transfer agent for storage of high mass negative ions formed during the ion/ion reaction period. Oxygen cations are shown to be acceptable for charge state manipulation of mixed-base oligomers but induce fragmentation in poly-adenylate homopolymers. Protonated isobutylene (C4H9+), on the other hand, is shown to induce significantly less fragmentation of poly-adenylate homopolymers. This presentation will describe the results mentioned above along with new developments in instrumentation that will lead to improvements in mass analysis. Two new systems are expected to come on line in the coming months. One comprises a three-dimensional quadrupole ion trap with at least twice the mass-to-charge range of the previous system and two to four times the resolving power. The other system will employ trapping in a two-dimensional quadrupole ion trap followed by time-of-flight mass analysis. This system promises to provide improvements in mass resolution, mass accuracy, and speed.
39. Peptide Sequencing and Identification Using de novo Analysis of Tandem Mass Spectra
William R. Cannon, K. D. Jarman, and K. H. Jarman
Pacific Northwest National Laboratory
Algorithms for sequencing peptides from mass spectrometry data have been seen as increasingly important as researchers start to think beyond genomic data towards the study of proteins. In many proteomic analyses, tandem mass spectrometry is used to identify peptides, which can then be linked back to their parent protein. Presently, most analyses of proteomic tandem mass spectra rely on comparisons with static information within genomic sequence databases. For example, the peptide dissociation pattern from MS/MS is compared to hypothetical dissociation patterns for peptides that are computationally generated from a sequence database. As such, the ability to discover new knowledge about post-translational modifications, mutations, unexpected reading frames and sequencing errors is severely compromised.
An alternative is to derive the protein sequence information de novo by exploiting the sequence information contained in tandem mass spectra. That is, since an MS/MS experiment on a biopolymer results in a sequential fragmentation of the parent biopolymer into daughter fragments, it is possible to determine the sequence of the parent peak from only the mass spectrum of the daughter peaks and knowledge of the mass of each monomeric subunit of the polymer. Here we report on the use of graph theory and set covering techniques that incorporate peak intensities as well as mass values to identify peptides without reliance on sequence databases.
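The principle of reading a sequence from daughter-peak mass differences can be sketched as a longest-path search in a spectrum graph, where peaks are nodes and an edge joins two peaks whose mass difference equals one residue mass. The sketch below is a simplified illustration and not the method reported here: it ignores peak intensities and set covering, uses only a small residue alphabet, and assumes an idealized ladder of prefix masses. All names are hypothetical.

```python
# Monoisotopic residue masses (Da) for a small illustrative alphabet.
# Leucine and isoleucine are isobaric, so "L" stands for either.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "F": 147.06841,
}


def residue_for(delta: float, tol: float = 0.01):
    """Return the residue whose mass matches delta within tol, or None."""
    for aa, mass in RESIDUE_MASS.items():
        if abs(delta - mass) <= tol:
            return aa
    return None


def longest_sequence(peaks, tol: float = 0.01) -> str:
    """Longest residue string spelled by a path through the spectrum graph."""
    peaks = sorted(peaks)
    best = [""] * len(peaks)  # best[j]: longest sequence ending at peak j
    for j in range(len(peaks)):
        for i in range(j):
            aa = residue_for(peaks[j] - peaks[i], tol)
            if aa is not None and len(best[i]) + 1 > len(best[j]):
                best[j] = best[i] + aa
    return max(best, key=len) if best else ""


def prefix_ladder(peptide: str):
    """Idealized ladder of cumulative residue masses, starting from zero."""
    masses, total = [0.0], 0.0
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        masses.append(total)
    return masses


spectrum = prefix_ladder("GAVL") + [150.5]  # one spurious peak added as noise
print(longest_sequence(spectrum))
```

A constant offset on all peaks, such as the charge-carrying proton, does not affect the result because only mass differences are used. Real spectra contain missing and overlapping ion series, which is where intensity information and the set covering formulation become important.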
40. Laser Desorption Mass Spectrometer for DNA Sequencing and Hybridization Detection
Winston C. H. Chen1, Steve L. Allman1, Klara J. Matteson2, and Lauri Sammartano3
1Oak Ridge National Laboratory
During the past few years, we have developed various approaches to sequence DNA and to measure DNA hybridization with mass spectrometry.
For DNA sequencing, both Sanger’s enzymatic synthetic method and Maxam Gilbert’s chemical degradation method have been pursued. Single stranded DNA with size up to 130 nucleotides and double stranded DNA with size up to 200 base pairs have be sequenced with laser desorption mass spectrometry. One approach is to use matrix-assisted laser desorption/ionization (MALDI) for DNA detection. We also developed a novel approach with laser induced acoutic desorption for DNA detection. In addition to the sequencing with DNA ladders, we also developed direct DNA sequencing technology without the need of DNA ladders. This technology is particularly suitable for primer and DNA probe analysis.
Microarray DNA hybridization has been considered as a valuable tool for high throughput DNA analysis. We have developed mass spectrometry technology for DNA hybridization analysis. With mass spectrometry as a detector, multiplexing hybridization on a single hybridization spot can be achieved. Thus, the throughput can be further increased. Furthermore, it is easier to distinguish perfect hybridization from single base mismatched hybridization. SNP (single nucleotide polymorphism) can be quickly analyzed with this approach. Hybridization with genomic DNA with mass spectrometry detection has also been demonstrated. Experimental details will be presented in poster.
41. Novel Molecular Labeling for Post-Genomic Studies
Xian Chen, Tom Hunter, Fadi Abdi, Haining Zhu, John Engen, Songqing Pan, Sheng Gu, Li Yang, Morton Bradbury, and Vahid Majidi
C-ACS, Chemistry Division, Bioscience Division, Los Alamos National Laboratory
Now that scientists have mapped the human genome, an even greater challenge has come to light: making sense of the vast amount of information contained in a genome. This challenge can be broken down into two main areas: functional genomics, which involves the further and accurate study of DNA sequence diversity to understand its function, and proteomics, the study of the full repertoire of proteins encoded by a genome. Mass spectrometry (MS) is a promising tool for rapid, rigorous, and sensitive analyses in both areas, but critical advances are needed to increase its specificity and accuracy for analyses at the genomic level. To address these issues, we have developed a novel MS-based Mass Tagging technique. On a genomic scale, our technique uses stable-isotope-labeled precursors (particular nucleotides for DNA, or amino acids for proteins) to label DNA or protein molecules residue-specifically for MS analysis. Displayed through a characteristic mass-split pattern induced by labeled precursors, the content of particular nucleotides or amino acid residues in particular DNA or protein fragments can be readily determined. We have applied this strategy successfully to many aspects of functional genomics and proteomics, including screening SNPs, validating DNA sequencing data with ambiguities left open by gel electrophoresis, identifying cellular proteins including membrane-bound and scarce proteins, detecting post-translational modifications in a residue-specific manner, and analyzing contact interfaces in protein/protein complexes. Our strategy of nucleotide- or amino acid-specific mass tagging in DNA or protein molecules provides a much more sensitive and accurate way of molecular labeling than radiological or chemical labeling.
In addition to the parameter of mass-to-charge ratio (m/z), the use of these site-specific stable-isotope labels, which tag biological molecules in a sequence-specific way, has dramatically enhanced the specificity, accuracy, sensitivity, and throughput of the MS-based technology for functional genomics and proteomics analyses.
42. Monolithic Integrated PCR Reactor-CE Microsystem for DNA Amplification and Analysis to the Single Molecule Limit
Eric T. Lagally1, Chung N. Liu2, and Richard A. Mathies1,3
1UCB/UCSF Joint Bioengineering Graduate Group, Berkeley, CA 94720
Microfabrication is an effective method for creating integrated microfluidic devices for high-performance chemical and biochemical analysis (1-3). Our early work included the development of the first integrated PCR reactor with a CE system (4) and the development of a CE chip with an integrated electrochemical detector (5). More recently, we have devised a monolithic system for conducting the polymerase chain reaction (PCR) directly connected to a capillary electrophoresis (CE) microchannel for product separation and analysis (6). Samples are loaded precisely into a 150-300 nL PCR reactor using 50 nL valves and hydrophobic vents. The sample is cycled between three temperatures using a resistive heater mounted on the bottom of the chip, and amplification products are directly injected and separated on the capillary electrophoresis channel. The device takes as little as 30 seconds/cycle, representing a vast improvement over conventional thermal cycling systems, which can take up to 5 minutes/cycle.
Our PCR-CE device has recently demonstrated successful detection of a PCR product amplified from a single DNA template molecule, bringing this technology to the limiting molecular sensitivity of the PCR reaction (7). In this work, a single M13 DNA template molecule was co-amplified with a control that was outside the stochastic regime. Calculation of peak area ratios between the two template products reveals discrete clusters, and these clusters conform to the expected Poisson distribution for stochastic single molecule events with a mean occupancy of 0.9 molecules. This device has the highest molecular sensitivity ever demonstrated in a PCR chip device. Previous static systems required ~6,000 starting copies (8), and continuous-flow geometries required as many as ~10^8 starting copies (9). High-sensitivity analyses, such as comparative gene expression studies from individual cells, can also be performed using these devices.
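To make the Poisson statement concrete, the short sketch below evaluates the probability of finding 0, 1, 2, or 3 template molecules in a reactor when the mean occupancy is 0.9, the value quoted above. The function name is illustrative and is not taken from the cited work.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

mean_occupancy = 0.9  # mean number of template molecules per reactor
for k in range(4):
    print(k, round(poisson_pmf(k, mean_occupancy), 3))
```

At this mean occupancy, roughly 41% of reactions are expected to contain no template, 37% a single molecule, and 16% two molecules, which is the kind of discrete clustering the peak area ratios are compared against.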
Recent improvements include the fabrication of a PCR-CE device with integrated resistance temperature detectors (RTDs) and heaters (10). Sputtered platinum four-wire RTDs are fabricated inside the PCR chambers, and platinum heaters with gold leads are fabricated on the backside of the device using sputtering and electroplating processes. Both elements connect to outside electronics using standard PC board connectors. These integrated components have resulted in more accurate temperature measurement and more efficient heating of the PCR chamber than previously possible, owing to the closer proximity of the RTD to the sample and better thermal contact between the heater and the glass wafer. A multiplex human sex-determination amplification of the centromeric alphoid repeat (11), run as a one-step reaction from human buccal cells, has recently been accomplished with this device.
We are also developing technology to enable a 96-sample PCR-CAE microplate for high-throughput integrated PCR-CE analyses from small amounts of template DNA. Current progress includes exploration of different valve and vent structures to reduce valve dead volumes and to increase the scalability and ease of fabrication of the device. Such an integrated device will also be capable of multiplexing with disparate amplification protocols while greatly reducing sample and reagent volumes and costs.
These results demonstrate a key advance in the development of an integrated microfluidic system that performs complete genetic analyses at sub-microliter volumes, useful in the areas of point-of-care genetics and rapid identification of infectious diseases.
43. Advances in Radial Capillary Array Electrophoresis Chip Sequencing and Genotyping Technology
Brian M. Paegel1, Robert G. Blazej2, Lorenzo Berti1, Charles A. Emrich3, James R. Scherer1 and Richard A. Mathies1
1Department of Chemistry, University of California, Berkeley, CA
In 1999, we introduced the radial microfabricated capillary array electrophoresis (mCAE) chip, rotary confocal fluorescence detection, and their application to simple genotyping (1). We have more recently been working on applying this technology to DNA sequencing with the goals of increasing analysis speed, integrating and automating sample preparation, and reducing costly reagent consumption. mCAE devices can help us to accomplish these goals by allowing high-speed DNA sequencing (2) and monolithic array construction for high-throughput analyses (3). Our current 96-lane mCAE device incorporates our radial array design (1) and hyper-turn channel geometries to extend separation channel lengths to 15.9 cm for DNA sequencing (4). Performing high-quality DNA sequencing required overcoming such challenges as high-viscosity gel matrix filling, system buffering for prolonged periods of electrophoresis, and sample and buffer evaporation at 60 °C. Using a high-pressure gel-filling instrument (5) and the Berkeley radial confocal fluorescence scanner, we have demonstrated sequencing of 41,000 phred 20 bases of an M13mp18 sequencing standard in only 25 minutes (6). We are also working closely with the DOE’s Joint Genome Institute to adapt the Berkeley mCAE platform for production-scale sequencing samples.
We have also used the mCAE device to develop a new method for polymorphism identification and screening. Polymorphism ratio sequencing (PRS) employs a novel dye labeling and reaction scheme to produce sequencing extension ladders for unambiguous SNP scanning. Combined with mCAE, the entire human mitochondrial genome was compared to a reference sample in one run of a 96-lane sequencing device (30 min.). PRS leverages the exquisite sensitivity intrinsic to fluorescence techniques and internal controls to extend effective read length and improve accuracy. This is particularly advantageous in population studies to determine allelic frequencies. Titration studies on pooled DNA samples demonstrate minor allele frequency detection limits below 10% (7).
Development of the PRS method was facilitated by the production of novel energy-transfer (ET) dye-cassette labels. Cassettes contain a fluorescein donor moiety linked by a sugar-phosphate spacer to an emitter dye. The cassette terminates in a mixed disulfide, which can be easily coupled to thiol-activated oligonucleotides via disulfide exchange (8). Together, these technologies address some of the basic needs of current process biology efforts and will be the next-generation high-throughput DNA sequencing and genotyping methodology.
44. Integrated Platform for Detection of DNA Sequence Variants Using Capillary Array Temperature Gradient Electrophoresis
Zhaowei Liu1, Cymbeline T. Culiat2, Tim Wiltshire3, Christina Maye1, Heidi Monroe1, Kevin Gutshall1, and Qingbo Li1
1SpectruMedix Corporation, 2124 Old Gatesburg Road, State College,
With more sequence information available, studies of DNA sequence variants such as mutation and single nucleotide polymorphism (SNP) are an important next stage in genome research and disease studies. Although many techniques have been developed for the detection of sequence variants, sensitive, high-throughput, and flexible techniques are still required for accurate detection and characterization if large-scale scans for sequence variants are to be used effectively for establishing correlations between certain variants and behavior of a biological system.
We have developed a highly versatile platform that performs temperature gradient capillary electrophoresis (TGCE) for mutation/SNP detection, sequencing and mutation/SNP genotyping for identification of sequence variants on an automated 24-, 96- or 192-capillary array instrument. In the first mode, multiple DNA samples consisting of homoduplexes and heteroduplexes are separated by capillary electrophoresis, during which a temperature gradient is applied that covers all possible Tms for the samples. The differences in Tms result in separation of homoduplexes from heteroduplexes, thereby identifying the presence of DNA variants. The sequencing mode is then used to determine the exact location of the mutation/SNPs in the DNA variants. The first two modes allow the rapid identification of variants from the screening of a large number of samples. Only the variants need to be sequenced. The third mode utilizes multiplexed single base extensions (SBEs) to survey mutations and SNPs at the known sites of DNA sequence. The TGCE approach combined with sequencing and SBE is fast and cost-effective for high throughput mutation/SNP detection.
We further tested the capabilities and sensitivity of TGCE for mutation scanning in the mouse genome by scanning candidate genes for ENU-induced mutations and by scanning a mutagenized ES cell library for specific gene mutations. To achieve this, we used a test set of DNAs from a collection of 480 PCR fragments of 450-500 bp that had previously been sequenced in 6 different mouse strains for SNP discovery, so all mutations in the fragments were known. This set of DNAs contained a range of polymorphisms including single base pair changes, multiple SNPs, and a few deletions and insertions (48% no SNP, 52% one or more SNPs). We used all 5 x 96-well samples as a test set, even those that gave no PCR product or more than one PCR product, and analyzed results for all data. The TGCE method implemented is a highly sensitive, high-speed, and high-throughput technique for mutation analysis. We report that 95% of the TGCE analyses concur with direct sequencing results. In addition, TGCE detected four more mutations that were initially missed by direct sequencing.
45. New Microfabrication Technologies for High-Performance Genetic Analysis Devices
Charles A. Emrich2, Toshihiro Kamei1, Will Grover1, and Richard A. Mathies1
1Department of Chemistry, University of California, Berkeley, CA
Modern microfabrication technologies facilitate the creation of novel genetic analysis devices having dramatically enhanced capabilities. We present here the results of three research projects with respective aims of developing: (i) a 384-channel Capillary Array Electrophoresis (CAE) microdevice for ultra high-throughput genotyping; (ii) an integrated optical detection system using high-sensitivity a-Si:H PIN diodes fabricated on glass; and (iii) a microfluidic-based DNA computer.
Microfabricated devices are replacing conventional drawn silica capillaries for use in high-throughput electrophoresis assays. Toward this end, we have designed and successfully demonstrated a radial 384-lane CAE microdevice and used it to genotype 384 individuals in less than 7 minutes (1). The CAE devices were fabricated on 200-mm glass wafers with channels 50 µm in diameter. Detection was accomplished with our rotary confocal laser-induced fluorescence scanner (2). We demonstrated the efficacy of the device for genotyping by testing 384 individuals for the common H63D mutation in the human HFE gene from PCR-RFLP derived samples.
Integration of the fluorescence detector on-chip is a fundamental challenge that will lead to the development of portable point-of-care lab-on-a-chip (LOC) platforms. Conventional semiconductor photodiodes are candidates for integrated detectors, but the required high-temperature fabrication procedures are not compatible with the glass or plastic substrates used for most lab-on-chip devices. We have instead chosen (in collaboration with Xerox PARC) to develop photodiodes made from hydrogenated amorphous silicon (a-Si:H) which can be fabricated at low cost and produced in large arrays. We have successfully performed DNA fragment sizing using the a-Si:H detector coupled to our current confocal fluorescence station demonstrating the feasibility of miniaturizing and integrating such detectors into a portable LOC device.
Finally, we are developing a microfluidic device to serve as a programmable microprocessor for execution of DNA-based computation. Our current device is an orthogonal array of 81 interconnected chambers containing structures that capture, release, or redirect populations of oligonucleotides that flow through the chambers. Such a device was first postulated and modeled theoretically in 1999 by Gehani and Reif (3). We have expanded on their model by incorporating the traditional Boolean logic gates AND, OR, and NOT specified by the path followed by a particular DNA molecule through the microprocessor. Using micromole quantities of input oligonucleotides it should be possible to encode all possible solutions to a complex combinatorial problem or to aid in solving NP-hard problems. The DNA microprocessor can also be used as a haplotyping tool to detect linkages across different polymorphisms using partially digested genomic DNA as logical inputs.
46. Microarray Electrophoretic DNA Mapping System
Gregory Zeltser, Alfred Goldsmith, Ilya Agurok, and Paul Shnitser
Physical Optics Corporation
Physical Optics Corporation (POC) proposes to develop a novel Microarray Electrophoretic DNA Mapping (DNA-MEM) system based on a multistage electrophoresis chip to carry out rapid, homogeneous stretching and mapping of multiple DNA molecules with sub-kilobase resolution. This system will bring needed improvements to current DNA optical mapping and DNA fiber FISH technologies, especially in terms of resolution.
The DNA-MEM system consists of three major components: the multistage electrophoresis chip, a spectroscopic imaging microscope, and a laser diode heating subsystem. DNA molecules are stretched and immobilized in the 3-D mesh of POC's hydrophilic polymer inside grooves of the chip. The DNA-MEM system will be able to process multiple samples at the same time. Moreover, the system can analyze both prestained DNA molecules suspended and stretched in porous polymer and DNA molecules that have been FISH stained after stretching and immobilization in the three-dimensional meshwork of the polymer. The DNA-MEM system design, analysis, and component development will be completed in Phase I.
Integrated biochip technology will be developed to the prototype stage in Phase II, and will be commercialized over the following two years as a Phase III engineering prototype.
216. A Microfabricated Hybrid Device for High-throughput and Long Read-length DNA Sequencing
Shaorong Liu, Jianzhong Zhang, Hongji Ren, and Jianbiao Zheng
To extend the straight separation channel on electrophoresis chips for DNA sequencing, we have created a hybrid device that couples a microfabricated twin-T injector on a microchip to a separation capillary. A standard isotropic photolithographic etching process creates semicircular grooves on a glass wafer (see Figure 1). Using a two-mask fabrication process, grooves of two different diameters can be made. Round channels of different diameters are formed when two such etched wafers are aligned face-to-face and bonded (see Figure 2). The capillary is inserted into the microchip through the larger channels, allowing integration of the capillary bore with the rest of the channels on the chip. We have performed four-color DNA sequencing using this hybrid device. Using a 200-µm twin-T injector coupled with a capillary of 20-cm effective separation distance, read lengths of more than 800 bases have been obtained at an accuracy of 98.5% in 56 min. At higher separation field strength, the separation time was reduced to 20 min for 700 bases at 98.5% base-calling accuracy. A 16-channel hybrid device has recently been tested using real-world samples from JGI. Tests of close to a thousand randomly selected samples have demonstrated an average read length of 675 phred20 bases with a success rate of 91% in 60-70 min.
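For readers comparing the two quality figures quoted above, the sketch below converts between a phred quality score and a per-base error probability using the standard definition Q = -10 log10(P). The function names are illustrative, and applying the per-base formula to an overall accuracy figure is a simplifying assumption.

```python
import math

def phred_to_error(q):
    """Per-base error probability implied by phred quality score q."""
    return 10 ** (-q / 10)

def accuracy_to_phred(accuracy):
    """Phred-scale equivalent of a given base-calling accuracy (0 < accuracy < 1)."""
    return -10 * math.log10(1 - accuracy)

print(phred_to_error(20))                   # error probability at phred 20
print(round(accuracy_to_phred(0.985), 1))   # phred-scale equivalent of 98.5% accuracy
```

A phred 20 base has a 1% chance of being wrong, so a read length counted in phred20 bases is a slightly stricter criterion than a read length reported at 98.5% accuracy, which corresponds to a score of about 18.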
The online presentation of this publication is a special feature of the Human Genome Project Information Web site.