Archive Edition
Sponsored by the U.S. Department of Energy Human Genome Program
Santa Fe, New Mexico, November 13-17, 1994
Introduction to the Workshop
The electronic form of this document may be cited in the following style: Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.
Integrated Informatics Support for Large-Scale Sequencing at the LBL Human Genome Center

E. Theil, A. Aggarwal, D. Davy, F. Eeckman, T. Fleming, V. Markowitz, J. McCarthy, S. Pitluck, E. Veklerov, M. Zorn

After approximately two years of effort, the LBL Human Genome Center has demonstrated that it can sequence DNA at a sustained rate of approximately 750 kb per six-person sequencing team. The Informatics Group has supported this project with a variety of software programs and databases, most of which have been described at earlier conferences or are detailed in separate abstracts for this meeting.

LBL is now planning to scale up its sequencing effort significantly. To accomplish this while continuing to lower the cost per base, it will not be enough merely to hire additional sequencing teams and support them with existing software. What is required instead is a more fully integrated and automated system in which the data produced by the directed strategy developed at LBL are both captured and modeled in software, so that all the information generated prior to sequencing can be exploited subsequently, in assembly and dissemination. Furthermore, this more highly integrated system will help raise the level of quality control on data and operations by identifying errors and providing computer assistance in troubleshooting.

We do not believe that general solutions are available, but we are able to combine a number of robust software components developed elsewhere with our own software to form an integrated system tailored to our particular needs. We discuss a system architecture that is designed to capture data either automatically or with manual input, and which is gradually replacing personal laboratory notebooks with a unified view of the data at any moment. The first pieces of this system will be modules for automatic inspection of ABI sequencing runs, automatic trace cutting, and browsing.
Other recently introduced modules provide automatic generation of transposon maps and support post-mortem analysis, in which maps based on the actual sizes of sequenced clones are compared with maps based on sizes estimated from electrophoresis. This helps to reduce the number of gaps encountered when assembling the sequence. Another important component now under development is a figure of merit associated with each base call. This will be a significant aid as we move to more automatic editing of sequences and to the systematic demonstration of quality in sequenced data. To integrate information from and for these modules, our Syndb database will store operational data, finished sequence, and up-to-date maps, all linked to each other. Application modules will communicate with the database both to access data and to update it as it is produced. Syndb will also support more than one analysis of the basic data (typically gels) in order to troubleshoot inconsistencies (typically false positives) as required.
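The post-mortem comparison described above can be illustrated with a minimal sketch. It is not the LBL implementation: the abstract compares whole maps, whereas this example compares one size per clone. The `Clone` record, the `post_mortem` function, the sample sizes, and the 10% tolerance are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    """Hypothetical record for one sequenced clone (names are illustrative)."""
    name: str
    estimated_size_bp: int  # size estimated from electrophoresis
    actual_size_bp: int     # size observed once the clone was sequenced


def post_mortem(clones, tolerance=0.10):
    """Return (name, deviation) for clones whose estimated size differs from
    the actual size by more than `tolerance`, as a fraction of actual size.

    The tolerance is an assumed threshold, not a value from the abstract.
    """
    flagged = []
    for clone in clones:
        deviation = (
            abs(clone.estimated_size_bp - clone.actual_size_bp)
            / clone.actual_size_bp
        )
        if deviation > tolerance:
            flagged.append((clone.name, round(deviation, 3)))
    return flagged


# Made-up example data: only c02 is badly mis-sized on the gel.
clones = [
    Clone("c01", estimated_size_bp=3000, actual_size_bp=3050),
    Clone("c02", estimated_size_bp=2500, actual_size_bp=3100),
    Clone("c03", estimated_size_bp=4000, actual_size_bp=3900),
]
flagged = post_mortem(clones)
print(flagged)  # [('c02', 0.194)]
```

Clones flagged this way would be candidates for re-examination, since a poor size estimate can misplace a clone on the map and leave a gap in the assembly.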
Last modified: Wednesday, October 29, 2003
Base URL: www.ornl.gov/hgmis
Site sponsored by the U.S. Department of Energy Office of Science, Office of Biological and Environmental Research, Human Genome Program