Data Surge Challenges Informatics Developers
The explosive growth of sequence and biological information poses pressing challenges for data acquisition, representation, access, and analysis. Some highlights from informatics sessions at the Santa Fe workshop follow.
bioWidgets: Adaptable, Reusable Modules for Viewing Data
Many analysis applications are tailored to the resources available at a particular site. The bioWidgets toolkit developed by Chris Overton's team [University of Pennsylvania (Penn)] instead takes a component-based approach: adaptable, reusable software modules that can be incorporated easily into a variety of applications and that promote interaction among those applications. Jonathan Crabtree described the team's efforts to develop and deploy graphical user interfaces for visualizing molecular, cellular, and genomic information. The current implementation includes widgets that display sequences, maps, BLAST results, chromosomes, and sequence alignments. The group also is developing interfaces for data stored in such distributed, heterogeneous databases as the Genome Database, Genome Sequence DataBase, Entrez, and ACeDB and is forming a consortium of bioWidget developers and users to establish standards. All bioWidgets are implemented in Java for Web distribution.
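The component idea can be illustrated with a minimal sketch. The bioWidgets themselves are written in Java; the Python below is not the bioWidgets API, and every class and method name in it (SelectionBus, SequenceWidget, MapWidget, on_select) is hypothetical. It shows only how two independent display components might stay in step by sharing selection events rather than calling each other directly.

```python
class SelectionBus:
    """Hypothetical shared channel that relays selections between widgets."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def publish(self, source, start, end):
        # Notify every widget except the one that made the selection.
        for listener in self._listeners:
            if listener is not source:
                listener.on_select(start, end)


class _Widget:
    """Common behavior: remember a selected range and share it on the bus."""

    def __init__(self, bus):
        self.bus = bus
        self.highlight = None
        bus.subscribe(self)

    def select(self, start, end):
        self.highlight = (start, end)
        self.bus.publish(self, start, end)

    def on_select(self, start, end):
        self.highlight = (start, end)


class SequenceWidget(_Widget):
    """Text view: brackets mark the selected bases."""

    def __init__(self, sequence, bus):
        super().__init__(bus)
        self.sequence = sequence

    def render(self):
        if self.highlight is None:
            return self.sequence
        s, e = self.highlight
        return self.sequence[:s] + "[" + self.sequence[s:e] + "]" + self.sequence[e:]


class MapWidget(_Widget):
    """Map view: one character per position, '=' marks the selection."""

    def __init__(self, length, bus):
        super().__init__(bus)
        self.length = length

    def render(self):
        if self.highlight is None:
            return "-" * self.length
        s, e = self.highlight
        return "".join("=" if s <= i < e else "-" for i in range(self.length))


bus = SelectionBus()
seq_view = SequenceWidget("ACGTACGTACGT", bus)
map_view = MapWidget(12, bus)
map_view.select(2, 6)      # selecting on the map also updates the sequence view
print(seq_view.render())   # AC[GTAC]GTACGT
print(map_view.render())   # --====------
```

Because each widget depends only on the bus, either view could be dropped into another application unchanged, which is the property the component-based approach aims for.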
Querying Across Databases with BioKleisli
Sue Davidson (Penn) described a new suite of tools that permits researchers to pose complex questions over the distributed, heterogeneous sources housing most genome-related data. Answering the query, "Find human sequence entries on human chromosome 22 overlapping q12," for example, currently requires access to three separate databases. The new system, which performs integration "on the fly" while allowing simultaneous structural transformations of the source data, is based on the powerful Kleisli integration system developed at Penn. Together with the high-level Collection Programming Language (CPL), BioKleisli can be used to integrate data through dynamic user-defined views or to create specialized data warehouses allowing fast access (http://www.pcbi.upenn.edu).
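To make the chromosome 22 example concrete, the following Python sketch answers the same kind of question over three small in-memory stand-ins for separate sources. It is neither CPL nor the BioKleisli interface. The source names, record fields, coordinates, and identifiers are all invented for illustration. The point is only that the answer requires joining a map source, a locus source, and a sequence source at query time.

```python
# Hypothetical stand-ins for three independent sources; every record is invented.
MAP_DB = [  # cytogenetic band -> coordinate range (arbitrary units)
    {"chromosome": "22", "band": "q11", "start": 100, "end": 199},
    {"chromosome": "22", "band": "q12", "start": 200, "end": 299},
    {"chromosome": "22", "band": "q13", "start": 300, "end": 399},
]
LOCUS_DB = [  # locus -> position on a chromosome
    {"locus": "LOC_A", "chromosome": "22", "start": 150, "end": 210},
    {"locus": "LOC_B", "chromosome": "22", "start": 320, "end": 340},
    {"locus": "LOC_C", "chromosome": "21", "start": 220, "end": 260},
]
SEQUENCE_DB = [  # sequence entry -> organism and locus
    {"entry": "SEQ001", "organism": "Homo sapiens", "locus": "LOC_A"},
    {"entry": "SEQ002", "organism": "Homo sapiens", "locus": "LOC_B"},
    {"entry": "SEQ003", "organism": "Mus musculus", "locus": "LOC_A"},
    {"entry": "SEQ004", "organism": "Homo sapiens", "locus": "LOC_C"},
]


def overlaps(a_start, a_end, b_start, b_end):
    """True when two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def human_entries_overlapping(chromosome, band):
    """Join the three sources at query time and return matching entry IDs."""
    # Step 1 (map source): find the coordinate range of the requested band.
    regions = [r for r in MAP_DB
               if r["chromosome"] == chromosome and r["band"] == band]
    # Step 2 (locus source): keep loci on that chromosome overlapping the band.
    loci = {l["locus"] for l in LOCUS_DB for r in regions
            if l["chromosome"] == chromosome
            and overlaps(l["start"], l["end"], r["start"], r["end"])}
    # Step 3 (sequence source): keep human entries attached to those loci.
    return sorted(s["entry"] for s in SEQUENCE_DB
                  if s["organism"] == "Homo sapiens" and s["locus"] in loci)


print(human_entries_overlapping("22", "q12"))  # ['SEQ001']
```

A warehouse-style alternative would precompute the joined table once and query it directly, trading freshness for speed; the article mentions both modes of use.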
Improved BCM Search Launcher
Kim Worley [Baylor College of Medicine (BCM)] reported on the enhanced sequence-analysis search services provided by the BCM Search Launcher. Search Launcher is an easy-to-use interface that organizes Web sequence-analysis servers according to function and provides a single point of entry for related searches. It adds hypertext links for easy access to Medline abstracts, related sequences, and other information. A BLAST Enhanced Alignment Utility (BEAUTY) tool makes it easier to identify weak but functionally significant matches in BLAST protein database searches. Recent enhancements make BEAUTY searches available for DNA queries (BEAUTY-X) and for gapped alignment searches (using WU-BLAST2). For users who need to perform a particular search on a number of sequences at once, the Batch Client provides access to all searches available from the BCM Search Launcher Web pages in a convenient drag-and-drop (Macintosh) or command line (UNIX, PC) interface. Future developments are focusing on the analysis of large-scale sequences to support the efforts of the Genome Annotation Consortium (see sidebar above).
WIT/WIT2: Reconstructing Metabolism
Analysis of the increasing number of fully or partially sequenced small genomes can serve as the foundation from which to look at more complex genomes. Evgeni Selkov and Ross Overbeek (both at Argonne National Laboratory) discussed the reconstruction of accurate metabolism models for 29 of these small organisms. Using sequence data supplemented with biochemical and phenotypic data, the group has made reconstructions (some based on still-incomplete sequence data) available via the WIT/WIT2 system. WIT2 is a UNIX-based system in two parts: a Web-based, data-access system and a set of batch tools offering extensible data-query access (http://wit.mcs.anl.gov/WIT2/wit.html).
WIT/WIT2 reconstructions are based on the metabolic pathway (MPW) collection, which includes over 2800 diagrams covering primary and secondary metabolism, membrane transport, signal-transduction pathways, intracellular traffic, transcription, and translation. Selkov observed that identifying universal metabolic aspects and gene families will lead to an integrated understanding of metabolic evolution and to technologies for developing higher-level functional models. In the current public release of MPW (http://wit.mcs.anl.gov/MPW), the coding, based on the pathways' logical structure, is represented by objects commonly used in electronic circuit design. Such a design facilitates diagram drawing and editing and enables automation of basic simulation operations.
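One way to picture a circuit-style pathway encoding is to treat each reaction as a component with input and output connections and each metabolite as a wire. The Python sketch below is an assumption-laden illustration, not the MPW format: the Reaction class and the reachable function are hypothetical names, and the "simulation" is a purely qualitative propagation that ignores stoichiometry, kinetics, and consumption of inputs. The three reactions are the first textbook steps of glycolysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """A pathway step modeled like a circuit component (hypothetical encoding)."""
    name: str
    inputs: frozenset   # metabolites that must all be present
    outputs: frozenset  # metabolites made available when the step fires


PATHWAY = [
    Reaction("hexokinase",
             frozenset({"glucose", "ATP"}),
             frozenset({"glucose-6-phosphate", "ADP"})),
    Reaction("phosphoglucose isomerase",
             frozenset({"glucose-6-phosphate"}),
             frozenset({"fructose-6-phosphate"})),
    Reaction("phosphofructokinase",
             frozenset({"fructose-6-phosphate", "ATP"}),
             frozenset({"fructose-1,6-bisphosphate", "ADP"})),
]


def reachable(reactions, available):
    """Propagate presence through the pathway until nothing new appears.

    Qualitative only: inputs are never used up, so this answers
    "which metabolites could appear at all?", not "how much?".
    """
    present = set(available)
    changed = True
    while changed:
        changed = False
        for reaction in reactions:
            fires = reaction.inputs <= present
            adds_something = not reaction.outputs <= present
            if fires and adds_something:
                present |= reaction.outputs
                changed = True
    return present


with_atp = reachable(PATHWAY, {"glucose", "ATP"})
without_atp = reachable(PATHWAY, {"glucose"})
print("fructose-1,6-bisphosphate" in with_atp)     # True
print("fructose-1,6-bisphosphate" in without_atp)  # False
```

Because the pathway is plain structured data, the same objects could in principle drive both diagram layout and simple checks such as the one above, which is the benefit the article attributes to a circuit-like representation.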
The electronic form of the newsletter may be cited in the following
Human Genome Program, U.S. Department of Energy,
Human Genome News (v9n3).