Human Genome 1993 Program Report: Small Business Innovation Research
Date Published: March 1994
SBIR awards are designed to stimulate commercialization of new technology for the benefit of both private and public sectors. The DOE SBIR program consists of three phases.
SBIR Phase I
Projects New in FY 1993
High-Performance Searching and Pattern Recognition for Human Genome Databases
Douglas J. Eadline
Paralogic, Inc.; Bethlehem, PA 18015
215/861-6960, Fax: -8247, Internet: deadline@plogic.com
The goal of the Human Genome Project is to determine the sequence of most or all 3 billion nucleotide bases. The size of the resulting genome databases will require new technologies that provide cost-effective and time-effective searching, interpretation, and analysis. One very promising approach to this problem is to (1) apply grammars expressed in the Prolog language ("linguistic analysis") to the "genetic language" expressed by DNA and (2) search in a concurrent or parallel fashion. The recent emergence of parallel computers and parallel programming tools has created the opportunity for rapid and cost-effective linguistic analysis of DNA sequences. This project seeks to determine the feasibility of combining grammar-based searching, Paralogic n-parallel PROLOG™, and parallel computers to search genome databases efficiently and quickly.
A High-Spatial-Resolution Spectrograph for DNA Sequencing
Cathy D. Newman
CHROMEX, Inc.; Albuquerque, NM 87107
505/344-6270, Fax: -6095
The Human Genome Project, if it is to be successful, will require much more rapid DNA sequencing. Present technology is estimated to be 100 to 500 times too slow to permit sequencing of the entire human genome by 2005. The ABI 373A DNA sequencer, the most successful technology currently employed for large-scale DNA sequencing, involves real-time, four-color, fluorescent line imaging during the electrophoresis process. In our estimation, this basic technology can be scaled to significantly higher levels and is likely to remain the method of choice for future DNA sequencing. Indeed, a number of research groups have been developing faster fluorescent DNA sequencers. Although great strides have been made in reducing the time of electrophoresis by 10- to 20-fold, future gains in throughput will depend primarily on increased multiplexing. High-throughput instruments will require a detection system with hundreds of spatially resolved channels.
CHROMEX proposes to test a completely new concept in high-spatial-resolution spectroscopy. Performance will be achieved by using a radically different approach to the spectrograph design. Such a spectrograph, when combined with a charge-coupled device array detector, will produce a system capable of rapidly recording spectral signatures from more than 300 spatial locations.
Nonradioactive Detection Systems Based on Enzyme-Fragment Complementation
Peter Richterich
Collaborative Research, Inc.; Waltham, MA 02154
617/487-7979, Fax: /891-5062, Internet: peter@cric.com
Assays based on DNA hybridization are very commonly used in molecular biology, clinical diagnostics, and the Human Genome Project. However, efficiency and sensitivity of hybridizations are often limited because of the nonspecific binding of probe molecules or detection-system components to target DNA or membrane supports.
In this project, detection methods will be developed for extremely sensitive, reliable, background-free detection in hybridization-based assays. The system will be based on two neighboring hybridization probes, each labeled with one subunit of alkaline phosphatase. Individual subunits of alkaline phosphatase are inactive, and nonspecific binding of hybridization probes to target DNA or membrane support will therefore not lead to background noise. When two probes bind next to each other on the target DNA, the two enzyme subunits can combine and form the active dimeric enzyme, which can be detected with chemiluminescent or colorimetric substrates. Increased hybridization specificity results from the necessity for two probes to hybridize next to one another.
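The two-probe logic described above can be illustrated with a minimal sketch. It assumes exact matching and represents each probe by the target-strand site it recognizes; the function names, the `max_gap` parameter, and the example sequences are hypothetical and not part of the project.

```python
def binding_sites(target, site):
    """Return start indices where a probe's recognition site occurs in the target."""
    return [i for i in range(len(target) - len(site) + 1)
            if target.startswith(site, i)]


def active_enzyme_positions(target, site_a, site_b, max_gap=0):
    """Return start indices of probe A sites that have a probe B site
    beginning within max_gap bases of their end.

    Only these positions would bring the two inactive enzyme subunits
    together; a probe bound alone contributes no signal.
    """
    b_starts = set(binding_sites(target, site_b))
    hits = []
    for a_start in binding_sites(target, site_a):
        a_end = a_start + len(site_a)
        if any(a_end + gap in b_starts for gap in range(max_gap + 1)):
            hits.append(a_start)
    return hits


# Hypothetical target with two probe A sites, only one flanked by a probe B site.
target = "GGACGTTTCAGGACGTAA"
signal_positions = active_enzyme_positions(target, "ACGT", "TTCA")
```

In this example the second probe A site has no neighboring probe B site, so it is not reported, which mirrors the way an isolated, nonspecifically bound probe would not generate background.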
In Phase I of this work, the feasibility and potential of such binary detection systems will be demonstrated. Conditions for generation, purification, storage, and use of subunit-labeled oligonucleotide probes will be defined and optimized. Finally, the binary detection systems will be used for chemiluminescent multiplex sequencing.
Separation Media for DNA Sequencing
David S. Soane and Herbert H. Hooper
Soane Technologies, Inc.; Hayward, CA 94545
510/293-1850, Fax: -1860
High-speed high-throughput DNA sequencing methods often rely on electrophoretic separation in very narrow geometries such as microcapillaries and ultrathin slabs. As channel dimensions decrease, the separation medium becomes a critical limiting factor in the speed, accuracy, and reproducibility of DNA fractionations. Conventional in situ casting of gels is not well suited for confined geometries, and the current approach of using noncrosslinked, entangled polymer solutions has several performance deficiencies, including those of resolution and reproducibility. This project involves a completely novel separation-media concept that combines the desirable performance attributes of crosslinked networks, the loading/unloading feature of dilute polymer solutions, and the user convenience of precast gels. The new approach will use discrete "smart" gel particles that form a free-flowing pseudo network under appropriate conditions and can be readily injected into and flushed from narrow capillaries. The feasibility of using this technology for DNA separation by capillary electrophoresis will be investigated in this project.
SBIR Phase I
Projects Continuing into FY 1993
Interactive DNA Sequence Processing for a Microcomputer
Wayne Dettloff and Holt Anderson
Advanced Technology Applications, Inc.; Research Triangle Park, NC 27709
919/248-1800, Fax: -1455
DNA and biosequences are being identified faster than they can be compared and analyzed, and present techniques for rigorously searching a large database are costly and time-consuming. Under all plausible growth projections, the problem will soon become overwhelming.
A custom, very large scale integrated (VLSI) circuit has been designed that could be used for high-speed comparison, analysis, and interpretation of DNA and protein sequences. In a single pass with a statistically rigorous criterion, the systolic array can scrutinize each biosequence segment in a database to determine its homology to an input pattern. Phase I research includes designing and prototyping a printed circuit board for an IBM-compatible AT personal computer; this low-cost circuit board uses a full custom VLSI integrated circuit. Software is being developed to enable the board to interface with commonly used public-domain software packages (e.g., BLAST3), and the performance of the system in actual laboratory settings is being evaluated against current techniques. In Phase II, an interactive analysis platform will be developed on the basis of evaluation results obtained in Phase I.
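The abstract does not name the comparison algorithm or the statistical criterion. As an illustration only, the sketch below assumes Smith-Waterman local alignment with a fixed score threshold standing in for the criterion. The scoring values and function names are hypothetical. The matrix is filled one anti-diagonal at a time because cells on the same anti-diagonal are independent of each other, which is the property a systolic array can exploit to update them in a single clock step.

```python
def local_alignment_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (linear gap penalty)."""
    rows, cols = len(query) + 1, len(subject) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    # Walk anti-diagonals (i + j constant); each cell depends only on
    # cells from the two previous anti-diagonals.
    for diag in range(2, rows + cols - 1):
        for i in range(max(1, diag - cols + 1), min(rows, diag)):
            j = diag - i
            pair = match if query[i - 1] == subject[j - 1] else mismatch
            cell = max(0,
                       score[i - 1][j - 1] + pair,
                       score[i - 1][j] + gap,
                       score[i][j - 1] + gap)
            score[i][j] = cell
            best = max(best, cell)
    return best


def scan_database(database, pattern, threshold):
    """Return names of database entries scoring at or above the threshold.

    The threshold is a placeholder for a statistically derived cutoff.
    """
    return [name for name, sequence in database.items()
            if local_alignment_score(pattern, sequence) >= threshold]


# Hypothetical two-entry database.
database = {"s1": "TTGATTACATT", "s2": "CCCCCC"}
hits = scan_database(database, "GATTACA", threshold=10)
```

A software loop visits the cells of each anti-diagonal one after another; dedicated hardware of the kind described could, in principle, assign one processing element per cell and compute the whole anti-diagonal at once.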
Low-Cost Massively Parallel Neurocomputing for Pattern Recognition in Macromolecular Sequences
John R. Hartman
Computational Biosciences, Inc.; Ann Arbor, MI 48106
313/426-9050, Fax: -5311, Internet: john@cbi.com
Connectionist (neural network) approaches to pattern recognition and analysis have attracted great interest recently because of their flexibility, ability to learn by example, and ability to "self-organize" to reveal hidden patterns and relationships in the input data set. Neural networks are inherently parallel computational structures and thus potentially excellent candidates for implementation on massively parallel computers. Particularly in the training stage, serial implementations of connectionist models are often limited either by network size or by the number of practical training trials. The explosion in the size of macromolecular sequence databases (such as GenBank®) has created a need for pattern-analysis solutions with superior cost-performance characteristics.
Phase I will implement efficient parallel algorithms for all critical components of a basic multilayer, feed-forward neural network model. These components include a "linear combiner" for the multiplication of connection-weight matrices with input (or error) vectors, a sigmoidal activation function to introduce nonlinearity in neuron behavior, and a complete parallel implementation of the back-propagation (generalized Delta rule) training algorithm. Once implemented, the network will be evaluated against alternative parameterizations in a prototype DNA sequence-pattern-recognition task.
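The three components named above can be sketched serially in a few lines. This is a minimal illustration, not the project's parallel implementation: it assumes one hidden layer, no bias terms, a squared-error objective, and one-hot encoding of bases, and every function name and parameter value here is hypothetical.

```python
import math
import random

ONE_HOT = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0],
           "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}


def encode(sequence):
    """Flatten a DNA string into a one-hot input vector (4 values per base)."""
    return [bit for base in sequence for bit in ONE_HOT[base]]


def linear_combiner(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def sigmoid(z):
    """Sigmoidal activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def make_network(n_in, n_hidden, n_out, seed=0):
    """Create small random weight matrices for the hidden and output layers."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
             for _ in range(n_out)]
    return w_hidden, w_out


def forward(w_hidden, w_out, x):
    """Return hidden and output activations for input vector x."""
    hidden = [sigmoid(z) for z in linear_combiner(w_hidden, x)]
    output = [sigmoid(z) for z in linear_combiner(w_out, hidden)]
    return hidden, output


def train_step(w_hidden, w_out, x, target, rate=0.5):
    """Apply one back-propagation update in place.

    Returns the squared error measured before the update.
    """
    hidden, output = forward(w_hidden, w_out, x)
    # Output deltas: error times the sigmoid derivative o * (1 - o).
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Hidden deltas: output deltas propagated back through w_out.
    delta_hidden = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][j]
                            for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += rate * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += rate * delta_hidden[j] * x[i]
    return sum((t - o) ** 2 for t, o in zip(target, output))


# Train on a single hypothetical positive example.
w_hidden, w_out = make_network(n_in=16, n_hidden=3, n_out=1)
x = encode("TATA")
first_error = train_step(w_hidden, w_out, x, [1.0])
for _ in range(200):
    last_error = train_step(w_hidden, w_out, x, [1.0])
```

The matrix-vector products in `linear_combiner` and the per-weight updates in `train_step` are independent across rows, which is why these steps are natural targets for a parallel-processor array.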
Linear or uncross-linked polyacrylamide has been used successfully in capillary electrophoresis to separate nucleic acids. Typical acrylamide concentrations for those applications are 3 to 14% (w/v), with consistencies ranging from almost liquid to moderately viscous. Its relatively fluid nature at typical concentrations and the absence of cross links have caused linear polyacrylamide in planar (slab) gel electrophoresis to be overlooked.
We have described an application of ultrathin (100 µm) high-viscosity slabs of linear polyacrylamide to planar electrophoresis of nucleic acid fragments [M. T. MacDonell and D. B. Roszak, Gene Anal. Tech. Appl. 10, 10-15 (1993)]. The approach is rapid and yields high-resolution separation of nucleic acid fragments in linear polyacrylamide supports. The mobility of DNA fragments of various lengths in a range of linear polymer concentrations is compared with the mobility for conventional cross-linked gels.
The reptative migration of larger DNA fragments in linear polymers is predictable from models of cross-linked acrylamide and agarose, but the migration of smaller fragments is not entirely consistent with the Ogston model. Relative mobilities for very small DNA fragments are about half those predicted by the Ogston regime.
The tendency of smaller fragments to deviate from predicted mobilities benefits the user because the overall effect is not unlike that of a field-strength gradient. An additional benefit of using water-soluble linear polymers is that quantitative recovery of DNA fragments from these gels requires only that the band be excised and dropped into buffer. The useful range of linear polymer concentrations in planar sequencing appears to be 20 to 35% (w/v), appropriate for the electrophoretic separation of DNA fragments ranging from 50 to about 2500 bp in length.
This project is developing a new solid-state biosensor technology for biological measurements. The technology is based on merging two relatively different technologies: nucleic acid probe (NAP) DNA hybridization technology with acoustic plate mode (APM) microsensors. Recent advances in the design and operation of APM devices have demonstrated a DNA hybridization sensor principle with excellent sensitivity (nanogram/milliliter), selectivity, and temperature stability when used with a model probe system.
This project is the first attempt at direct electronic in situ sensing of a diagnostically significant DNA gene sequence that codes for the abundant late matrix antigen of Cytomegalovirus. The new DNA biosensor operates in a continuous and therefore homogeneous mode and yet is expected to equal or exceed the sensitivity limits of existing dot-blot technologies. Thus, the biosensor is expected over time to respond in situ to the target DNA. In contrast, the current technology yields a discrete endpoint result. The biosensor response will be an electronic signal, allowing numerous data-handling options.
The Phase I project consists of several components: (1) construction of an optimized dual delay line APM sensing element, (2) design and construction of a dual oscillator measurement system for the optimized APM, (3) optimization of the chemistry employed to attach NAP covalently, (4) evaluation of sensor performance in detecting in situ hybridization of target DNA, and (5) completion of the sensor characterization.
Projects Continuing into FY 1993
Current large-scale genome mapping methodology suffers from a lack of tools for generating specific DNA fragments in the megabase-size range. To address this need, Promega Corporation conducted several Phase I studies. These studies examined the feasibility of developing a set of site-specific endonucleases capable of generating DNA fragments in the 2- to 100-Mb-size range in a single step. Phase I demonstrated that I-PpoI, the group I intron-encoded endonuclease, has excellent potential for use in general molecular biological and human genome research. This potential stems from I-PpoI's ability to be expressed and purified at high yield, its stability and activity under a variety of reaction conditions, and its highly efficient single-site cleavage of genomic DNA embedded in agarose.
Phase II is designed to develop I-PpoI for commercialization and broaden its potential for human genome mapping and analysis of other complex genomes. To accomplish these goals, we have systematically examined I-PpoI's ability to tolerate single base substitutions within its recognition site. In addition, through cross-linking studies and crystallographic analysis, we are investigating the structure of I-PpoI when bound to its recognition site. Using this information, we propose to: (1) identify structural modifications and reaction conditions that enhance I-PpoI's rare-cutting capabilities by relaxing the enzyme's specificity; (2) isolate I-PpoI proteins with mutations that enable the nuclease to cleave at altered recognition sequences; (3) further extend the cutting capabilities of the nuclease by combining approaches 1 and 2; and (4) test the ability of the native, mutant, and modified enzymes to cut human and other complex genomic DNA in agarose.
Thus, our Phase II work should provide a set of conditions and modified I-PpoI enzymes with a range of useful cutting frequencies for complex genome-mapping applications. In addition, this work will provide a systematic structure-function analysis of a novel type of DNA-protein interaction. These results should lead to successful commercialization efforts that will focus on both immediate complex genomic research applications and on the longer-term goal of incorporating I-PpoI into genomic mapping instrumentation.
Projects Continuing into FY 1993
The goals of the Human Genome Project require DNA sequencing techniques that can increase throughput by an order of magnitude or more. Current automated fluorescence sequencing instruments require 5 to 10 h for a single gel run and accommodate a maximum of 36 reaction sets. Capillary gels can be run at higher voltages to yield separation of 300 to 500 bases in 1 to 2 h, but fluorescence-detection methods are limited to analyzing 1 capillary at a time.
Confocal fluorescence imaging offers high detection sensitivity and has the geometrical advantage of delivering excitation light through the same lens that collects fluorescence emission. R. A. Mathies and his colleagues have shown that a single scanning-confocal-fluorescence detector can detect DNA fragments in a parallel array of capillaries. Phase I was directed toward testing the feasibility of this technology for high-throughput DNA sequencing, including automated methods for loading and analyzing up to 96 simultaneous reaction sets in 1 to 2 h.
Phase II will identify and test various dye chemistries, lasers, base-coding methods, and software to identify solutions that would be successful in a commercial instrument. Low-viscosity gel matrices will also be tested, and a capillary-array filling and flushing system will be designed and built. A working prototype capillary-array sequencer will be built for testing in a high-throughput laboratory.
In Phase I, spatially defined arrays of oligonucleotide probes were constructed to study the feasibility of DNA sequencing by hybridization. Newly developed techniques in light-directed polymer synthesis were used to construct high-density oligonucleotide arrays, explore kinetic and solvent-related parameters of target hybridization, and read the hybridization positions by epifluorescence microscopy. Specific combinatorial synthesis strategies were designed to address experimental issues of parallel hybridization.
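The principle of sequencing by hybridization can be illustrated with a minimal sketch: the set of probes that hybridize to a target (its spectrum) is chained by overlap to recover the sequence. The sketch assumes error-free hybridization, a complete set of probes of length k, and no repeated (k-1)-mers in the target; it gives up rather than guessing when the chain branches. The function names are hypothetical.

```python
def spectrum(target, k):
    """Return the set of all length-k substrings of the target."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(kmers):
    """Chain k-mers by (k-1)-base overlap.

    Returns the reconstructed sequence, or None when the spectrum does
    not determine a single unbranched chain.
    """
    kmers = set(kmers)
    by_prefix = {}
    for kmer in kmers:
        by_prefix.setdefault(kmer[:-1], []).append(kmer)
    suffixes = {kmer[1:] for kmer in kmers}
    # The first probe is the one whose prefix is not the suffix of any probe.
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        return None
    current = starts[0]
    sequence = current
    used = {current}
    while True:
        successors = by_prefix.get(current[1:], [])
        if not successors:
            break
        if len(successors) > 1 or successors[0] in used:
            return None  # branch point or cycle: ambiguous
        current = successors[0]
        used.add(current)
        sequence += current[-1]
    return sequence if len(used) == len(kmers) else None


# Hypothetical 8-base target probed with all 3-mers.
probes = spectrum("ATGCGTAC", 3)
recovered = reconstruct(probes)
```

Repeats longer than k - 1 bases create branch points in the chain, which is one reason probe length and array design matter for this approach.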
Phase II research will improve the basic technology by developing advanced instrumentation, including high-speed detection systems; upgrading the image-analysis software to handle larger data sets; and formulating algorithms for the design of application-specific arrays of probes. Completion of this work will lead to sequencing instrumentation that could provide order-of-magnitude improvements in DNA sequencing productivity and generate new technologies for exploring genetic diversity for diagnostic applications.
The objective of this study is to improve the performance efficiency of a nonisotopic detection method for DNA sequencing. After electrophoretic separation, transfer to a nylon membrane, and incubation with a streptavidin-alkaline phosphatase conjugate, DNA sequencing-reaction products labeled with biotinylated primers are imaged by utilizing a chemiluminescent detection procedure that incorporates 1,2-dioxetane substrates for alkaline phosphatase. Upon dephosphorylation, these enzymatic substrates decompose and emit light.
In Phase I, the feasibility of a multiprime strategy for increasing the amount of DNA sequence information from a single membrane was assessed. Multiple sets of DNA sequencing reactions were loaded into a single set of gel lanes, and following electrophoretic separation and DNA transfer to nylon membrane, each set of sequencing reactions was individually detected. Multiple DNA sequencing primers were used, each bearing a unique ligand label, including biotin, digoxigenin, 2,4-dinitrophenyl, or fluorescein. Each set of reactions was detected with a ligand-specific alkaline phosphatase conjugate. The performance of various labels and enzyme conjugates was evaluated, and an optimum procedure was developed for rapid sequential detection of each set of labeled fragments.
Phase II will further optimize the procedure for sequential detection of each set of labeled fragments and will incorporate the newly developed protocols and individual reagents into a DNA-sequencing kit with multiple hapten-labeled primers. A complementary chemiluminescent detection kit containing hapten-specific enzyme conjugates will be produced for sequential identification of overlapping polymerase chain reaction (PCR) products. The creation of a specific membrane optimized for chemiluminescence will be investigated.
In addition, an apparatus for automating the blot-development steps will increase throughput and permit system scaleup. Techniques will also be developed for interfacing the multiplex-labeling DNA-sequencing procedures with PCR amplification and single-stranded DNA template purification. Protocols for these techniques will be incorporated into the research kits and will greatly expand their usefulness.
Electrophoretic Separation of DNA Fragments in Ultrathin Planar-Format Linear Polyacrylamide
Michael T. MacDonell and Darlene B. Roszak
Ransom Hill Bioscience, Inc.; Ramona, CA 92065
619/789-9483, Fax: -6902
An Acoustic Plate Mode DNA Biosensor
Douglas J. McAllister
BIODE, Inc.; Cape Elizabeth, ME 04107
207/883-1492, Fax: -1482
SBIR Phase II
Increased Speed in DNA Sequencing by Utilizing LARIS and SIRIS to Localize Multiple Stable Isotope-Labeled Fragments
Heinrich F. Arlinghaus
Atom Sciences, Inc.; Oak Ridge, TN 37830
615/483-1113, Fax: -3316
Site-Specific Endonucleases for Human Genome Mapping
George Golumbeski, Kimberly Knoche, Susanne Selman, Jim Hartnett, and Lydia Hung
Promega Corporation; Madison, WI 53711
608/274-4330, Fax: /277-2516
High-Performance DNA and Protein Sequence Analysis on a Low-Cost Parallel-Processor Array
John R. Hartman and David L. Solomon
Computational Biosciences, Inc.; Ann Arbor, MI 48106
313/426-9050, Fax: -5311
SBIR Phase I and II
Rapid, High-Throughput DNA Sequencing Using Confocal Fluorescence Imaging of Capillary Arrays
David L. Barker and Jay Flatley
Molecular Dynamics; Sunnyvale, CA 94086
408/773-1222, Fax: -8343
Spatially Defined Oligonucleotide Arrays
Stephen P. A. Fodor
Affymetrix; Santa Clara, CA 95051
408/481-3400, Fax: -0422, Internet: steve_fodor@qmgates.affymetrix.com
Chemiluminescent Multiprimed DNA Sequencing
Chris S. Martin, Corinne E. M. Oleson, and Irena Bronstein
Tropix, Inc.; Bedford, MA 01730
617/271-0045, Fax: /275-8581