The electronic form of this document may be cited in the following style:
Human Genome Program, U.S. Department of Energy, DOE Human Genome Program Contractor-Grantee Workshop IV, 1994.
Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.
A Conceptual Model for Genome Mapping Data
Mark Graves and Charles B. Lawrence
Department of Cell Biology, Baylor College of Medicine, Houston, TX 77030.
Human genome mapping projects are generating large amounts of data which must be stored in databases to be easily accessible. A good first step in designing any database is to develop a conceptual schema which captures the concepts and relations of the domain. We have developed a simple conceptual model based on graphs which has proven itself useful in the design of several mapping databases. We present the conceptual model and some of the mapping schemas we have developed within the model.
Conceptual modeling is the process of describing the concepts and relationships of a domain that are to be stored in a database [1]. The process takes place within a theoretical framework called a conceptual model. A conceptual model is a data model which formalizes the representation and manipulation of concepts and relationships. The conceptual model defines the language used to describe the domain. A conceptual model is used by a database developer to describe the aspects of a domain which are to be captured by a database. The description of a domain is called a conceptual schema.
One foundation of conceptual models is graphs. Graphs have proven themselves useful for representing complex domains, may be stored and queried in a database [2], and have a strong mathematical foundation in graph theory. A graph conceptual model is a data model which represents the connections between the concepts and relationships in a domain as a graph.
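As a minimal sketch of this idea, a conceptual schema can be written down as a small graph whose nodes are concept types and whose edges are the relationships allowed between them. The concept names, relationship names, and instance identifiers below are hypothetical, chosen only for illustration; they are not the schemas developed by the authors.

```python
from dataclasses import dataclass

# Hypothetical schema graph: each triple (source concept, relationship,
# target concept) is one edge allowed by the conceptual schema.
SCHEMA = {
    ("YAC", "contains", "STS"),
    ("Gene", "located_on", "Chromosome"),
    ("Marker", "located_on", "Chromosome"),
}


@dataclass(frozen=True)
class Node:
    concept: str  # concept type, e.g. "YAC"
    name: str     # instance identifier (made up for this example)


def conforms(source: Node, relationship: str, target: Node) -> bool:
    """Return True if an instance edge matches an edge of the schema graph."""
    return (source.concept, relationship, target.concept) in SCHEMA


yac = Node("YAC", "yac_A")
sts = Node("STS", "sts_1")
```

Here `conforms(yac, "contains", sts)` holds because the schema has a matching edge, while the reversed edge from `sts` to `yac` is rejected.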
Graph models are useful for representing mapping data. We have developed conceptual schemas for several laboratory databases, including gene (cDNA) mapping, discovery of dinucleotide repeat markers, and YAC-STS hybridization experiments. Graphs can represent the location of genes and markers, the order and distance between them [3], and containment relationships. Graph models may also be decomposed into binary relationships which are compatible with physical map viewing tools, such as SIGMA [4].
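The following sketch illustrates, under stated assumptions, how order, distance, and containment might be held as labeled edges and then decomposed into binary relationships. The edge labels (`precedes`, `distance_to`, `contains`), the object names, and the distance value are all hypothetical, and the output is not the input format of SIGMA or any other tool.

```python
from typing import NamedTuple, Optional


class Edge(NamedTuple):
    source: str
    relation: str
    target: str
    value: Optional[float] = None  # e.g. a distance; units are left unspecified


# Hypothetical map fragment (all names and numbers are made up).
edges = [
    Edge("marker_1", "precedes", "marker_2"),
    Edge("marker_2", "precedes", "gene_X"),
    Edge("marker_1", "distance_to", "marker_2", 150.0),
    Edge("yac_A", "contains", "marker_1"),
    Edge("yac_A", "contains", "marker_2"),
]


def binary_relationships(edges):
    """Decompose the graph into one list of (source, target) pairs per relation.

    Edge values (such as distances) are dropped in this simplified sketch.
    """
    tables = {}
    for e in edges:
        tables.setdefault(e.relation, []).append((e.source, e.target))
    return tables


def linear_order(edges):
    """Recover the order from 'precedes' edges, assuming they form one chain."""
    successor = {e.source: e.target for e in edges if e.relation == "precedes"}
    # The chain starts at the only node that is never a successor.
    current = (set(successor) - set(successor.values())).pop()
    order = [current]
    while current in successor:
        current = successor[current]
        order.append(current)
    return order
```

Each key of the dictionary returned by `binary_relationships` corresponds to one binary relationship, which is the kind of flat structure a relational table or a map viewer could consume.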
We have found that conceptual models based on graphs are useful tools for developing mapping databases. They are also a convenient mechanism for communication between biologists and database developers, which helps guide the database development toward a system which meets the needs of genome mappers.
This work was supported by the W.M. Keck Center for Computational Biology and the Baylor Human Genome Center funded by the NIH National Center for Human Genome Research. In addition, C.B.L. was supported by a grant from the Department of Energy, and M.G. was supported by a fellowship from the National Library of Medicine and a Department of Energy Human Genome Postdoctoral Fellowship.
[1] Brodie, Michael, John Mylopoulos, and Joachim Schmidt, editors (1984). On conceptual modelling: Perspectives from artificial intelligence, databases, and programming languages. Springer-Verlag, New York.
[2] Graves, Mark (1994). Querying a Genome Database Using Graphs. In H. Lim, ed., Proceedings of the Second International Conference on Bioinformatics and Genome Research. World Scientific Publishing.
[3] Graves, Mark (1993). Integrating order and distance relationships from heterogeneous maps. In L. Hunter, D. Searls and J. Shavlik, editors, Proceedings of the First International Conference on Intelligent Systems for Molecular Biology (ISMB-93), Menlo Park, CA. AAAI/MIT Press.
[4] Cinkosky, M.J., Fickett, J.W., Barber, W.M., Bridgers, M.A., and Troup, C.D. (1992). SIGMA: A system for integrated genome map assembly. Los Alamos Science 20: 267-269.