Archive Edition
Sponsored by the U.S. Department of Energy Human Genome Program
Santa Fe, New Mexico, November 13-17, 1994
The electronic form of this document may be cited in the following style: Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.
BioPOET: Large Scale Sequence Analysis On Workstation Farms
Manfred D. Zorn, Jane F. Macfarlane, Rob Armstrong [1], Michael H. Cooper, and Nicholas C. Weaver

The rate at which new sequences are being generated has increased dramatically. A standard procedure for analyzing sequences is to compare them with already known sequences, so ever longer sequences are matched against ever larger databases. The computing technology needed to tackle such problems, e.g., faster machines, parallel processing, and distributed computing, already exists. However, using these resources optimally requires detailed knowledge of each particular resource.

We developed a framework that partitions the necessary tasks and executes them on a workstation farm. A master reads the database and creates a task for each sequence. Each worker requests a new task, compares the database sequence in that task with the query sequence, and reports the results back to the master. A graphical user interface allows easy input and parameter specification, interacts with a network server to launch the program, and displays the final results graphically. The tasks themselves use existing software, e.g., filter [1] to search the database efficiently and align [2] to generate the final alignment.

The framework makes use of the Parallel Object-oriented Environment and Toolkit, POET, which is modeled after the X11 toolkit and enables both high- and low-level control of the computational methods. The object-oriented programming paradigm allows data encapsulation and methods to hide implementation details so as to present a unified object view to the user. Existing software can be adapted to exploit the power of parallel processing. Thus sequence analysis can be performed transparently to the user in reasonable time: POET divides either the query sequence or the database into multiple pieces to run on parallel computers or on a number of workstations in a distributed environment.
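The master/worker scheme described above can be illustrated with a minimal sketch. It is not the BioPOET implementation: it assumes Python threads and in-process queues in place of C++, POET, and PVM, and a toy shared k-mer count in place of the filter and align programs. The names `similarity`, `worker`, and `run_farm` are hypothetical and chosen only for this illustration.

```python
import queue
import threading


def similarity(query, subject, k=3):
    """Toy score: number of distinct k-mers shared by the two sequences.

    Assumption: this stands in for the real comparison software, whose
    algorithms are not reproduced here.
    """
    def kmers(seq):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    return len(kmers(query) & kmers(subject))


def worker(query, tasks, results):
    """Request tasks until a None sentinel arrives; report each score back."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: no more work for this worker
            break
        name, sequence = task
        results.put((name, similarity(query, sequence)))


def run_farm(query, database, n_workers=4):
    """Master: create one task per database sequence and collect the results."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(query, tasks, results))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()

    # One task per database sequence, as in the scheme described above.
    for name, sequence in database.items():
        tasks.put((name, sequence))
    # One sentinel per worker so that every worker terminates.
    for _ in threads:
        tasks.put(None)
    for thread in threads:
        thread.join()

    # Every task produced exactly one result, so collect that many.
    hits = [results.get() for _ in database]
    # Best score first; ties broken by sequence name for a stable order.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

For example, `run_farm("ACGTACG", {"seqA": "ACGTACGTAC", "seqB": "TTTTTTTTTT"})` returns a list of `(name, score)` pairs ranked by score. Because workers pull tasks from a shared queue rather than receiving a fixed share, a slow machine simply takes fewer tasks, which is one reason this pattern suits a farm of unequal workstations.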
We will present a prototype system that integrates sequence analysis into the sequencing protocol and performs sequence comparisons on a workstation farm. The framework is implemented in C++ and uses PVM as its communication package. The graphical user interface is implemented in VisualWorks\Smalltalk from ParcPlace Systems.

[1] Chang, W. and Marr, T., Approximate String Matching and Local Similarity, in Combinatorial Pattern Matching, Springer Verlag, 1994.

This work was supported by the Director, Office of Energy Research, Office of Health and Environmental Research, Human Genome Program, of the US Department of Energy under Contract No. DE-AC03-76SF00098.
Last modified: Wednesday, October 29, 2003
Base URL: www.ornl.gov/hgmis
Site sponsored by the U.S. Department of Energy Office of Science, Office of Biological and Environmental Research, Human Genome Program