The electronic form of this document may be cited in the following style:
Human Genome Program, U.S. Department of Energy, DOE Human Genome Program Contractor-Grantee Workshop IV, 1994.
Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.
A Prototype 2nd-Generation Database Schema for Large-Scale Physical Mapping
Tom Slezak (firstname.lastname@example.org), Mark Wagner, and T. Mimi Yeh
Human Genome Center, L-452, Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, Livermore, California 94550
The Human Genome Center at LLNL has been developing a relational database for the last 5 years to support our work on chromosome 19. We deliberately made this database specific to the immediate task at hand, knowing that we would gain much experience before we needed to expand to other genetic real estate.
Closure of the physical map of ch19 is now in sight and we are preparing to expand our mapping work in a wide variety of directions, including (portions of) human ch2, certain gene families across the entire human genome, regions of interest in other genomes with conserved synteny with respect to humans, and possible (eventual) work on various bacterial, plant, and other animal genomes. Separate databases for each chromosome or genome are not well-suited for the comparative biology that lies in our future. We need to scale up our database to be able to handle queries on physical mapping data that span all our objects and their inter-relationships regardless of species.
To accomplish this, we are drawing on concepts employed in our "generic map object" database for automated map integration on ch19, as well as ideas taken from the excellent GSDB schema (Michael Cinkosky, et al.). Major concepts include: all objects will have a unique, permanent identifier; entries will be labeled with owner(s) and date(s); objects and relations will be highly abstracted (single base-class tables for all clones and probes, a single hybridization results table, etc.); laboratory notebook details will be cleanly separated from results, storage, maps, citations, taxonomy, and attributes. This approach will allow us to reference external databases when adequate ones are available (e.g., citations, taxonomy, sequence) for on-line access. It also allows us to make major changes in one component (say, storage) without causing collateral damage to unrelated portions of the schema.
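The concepts listed above can be illustrated with a minimal sketch. It is not the LLNL schema: every table, column, and function name here is hypothetical, SQLite stands in for whatever relational system was actually used, and the sample species and object names are invented. The sketch assumes one registry table that issues permanent identifiers and carries owner and date, one base table each for clones and probes, and a single hybridization results table.

```python
import sqlite3

# Hypothetical schema; names are illustrative assumptions only.
SCHEMA = """
CREATE TABLE object (
    object_id  INTEGER PRIMARY KEY AUTOINCREMENT,  -- permanent, never reused
    species    TEXT NOT NULL,
    owner      TEXT NOT NULL,
    entered_on TEXT NOT NULL
);
CREATE TABLE clone (   -- one base table for every kind of clone
    object_id  INTEGER PRIMARY KEY REFERENCES object(object_id),
    clone_kind TEXT NOT NULL,
    name       TEXT NOT NULL
);
CREATE TABLE probe (   -- one base table for every kind of probe
    object_id  INTEGER PRIMARY KEY REFERENCES object(object_id),
    probe_kind TEXT NOT NULL,
    name       TEXT NOT NULL
);
CREATE TABLE hybridization (   -- single results table for all experiments
    probe_id   INTEGER NOT NULL REFERENCES probe(object_id),
    clone_id   INTEGER NOT NULL REFERENCES clone(object_id),
    result     TEXT NOT NULL CHECK (result IN ('positive', 'negative')),
    owner      TEXT NOT NULL,
    entered_on TEXT NOT NULL
);
"""


def open_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def _new_object(conn, species, owner, entered_on):
    """Register an object and return its permanent identifier."""
    cur = conn.execute(
        "INSERT INTO object (species, owner, entered_on) VALUES (?, ?, ?)",
        (species, owner, entered_on),
    )
    return cur.lastrowid


def add_clone(conn, name, kind, species, owner, entered_on):
    oid = _new_object(conn, species, owner, entered_on)
    conn.execute(
        "INSERT INTO clone (object_id, clone_kind, name) VALUES (?, ?, ?)",
        (oid, kind, name),
    )
    return oid


def add_probe(conn, name, kind, species, owner, entered_on):
    oid = _new_object(conn, species, owner, entered_on)
    conn.execute(
        "INSERT INTO probe (object_id, probe_kind, name) VALUES (?, ?, ?)",
        (oid, kind, name),
    )
    return oid


def record_hybridization(conn, probe_id, clone_id, result, owner, entered_on):
    conn.execute(
        "INSERT INTO hybridization VALUES (?, ?, ?, ?, ?)",
        (probe_id, clone_id, result, owner, entered_on),
    )


def positive_clones(conn, probe_id):
    """Return (clone name, species) for every positive hit, in any species."""
    return conn.execute(
        """SELECT c.name, o.species
           FROM hybridization h
           JOIN clone c ON c.object_id = h.clone_id
           JOIN object o ON o.object_id = c.object_id
           WHERE h.probe_id = ? AND h.result = 'positive'
           ORDER BY c.name""",
        (probe_id,),
    ).fetchall()
```

Because species is an attribute of the registered object rather than a reason for a separate database, a query such as `positive_clones` spans genomes without any special handling.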
A key point of our design has been to focus on capturing appropriate abstract atomic relationships, which allow subsequent creation of multiple (possibly conflicting) definitions of higher-level map objects. In our experience with the ch19 database, this relation-oriented approach, unlike traditional object-oriented approaches, has proven highly flexible with respect to changes in biology, user needs, and the desire for multiple simultaneous viewpoints. Our early experience with a prototype of this schema will be detailed.
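One way to picture this idea is the sketch below, which is an illustration under stated assumptions rather than the method used at LLNL. Pairwise clone overlaps are stored as atomic relationships tagged with the evidence behind them, and a higher-level object (here a contig, taken to be a connected group of overlapping clones) is derived on demand. The clone names, the evidence labels, and the function `build_contigs` are all hypothetical.

```python
from collections import defaultdict

# Atomic relationships: (clone_a, clone_b, evidence). Sample data is invented.
OVERLAPS = [
    ("c1", "c2", "fingerprint"),
    ("c2", "c3", "hybridization"),
    ("c4", "c5", "fingerprint"),
    ("c3", "c4", "hybridization"),
]


def build_contigs(overlaps, accepted_evidence):
    """Group clones into contigs using only overlaps of accepted evidence types.

    The stored relationships are never modified; each call derives one
    viewpoint, so different callers can hold conflicting contig definitions.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, evidence in overlaps:
        # Register both clones even when this overlap is not accepted.
        root_a, root_b = find(a), find(b)
        if evidence in accepted_evidence:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for clone in parent:
        groups[find(clone)].add(clone)
    return sorted(sorted(group) for group in groups.values())


# Two viewpoints derived from the same atomic data.
conservative = build_contigs(OVERLAPS, {"fingerprint"})
permissive = build_contigs(OVERLAPS, {"fingerprint", "hybridization"})
```

With the sample data, the conservative view keeps three separate groups while the permissive view merges all five clones into one contig. Neither result is written back, so both definitions can coexist over the same underlying relationships.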
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract no. W-7405-ENG-48.