The electronic form of this document may be cited in the following style:
Human Genome Program, U.S. Department of Energy, DOE Human Genome Program Contractor-Grantee Workshop IV, 1994.
Abstracts scanned from text submitted for November 1994 DOE Human Genome Program Contractor-Grantee Workshop. Inaccuracies have not been corrected.
Architecture of the Genome Data Base Version 6.0
Peter Li, David D. Marquette, Christopher W. Brunn, and Kenneth H. Fasman
Division of Biomedical Information Sciences, Johns Hopkins University School of Medicine, 2024 E. Monument St., Baltimore MD 21205-2100
The existing architecture of the Genome Data Base (GDB) has three main limitations. The first is its evolving monolithic design -- all data exist in one physical database. This increases the maintenance burden, limits growth, slows deployment, and reduces the performance of GDB. The second is its character-based user interface. The lack of a graphical user interface (GUI) makes GDB awkward for the many users who are familiar with Windows or Mac interfaces. The third is the requirement of SQL for application development. All applications that access GDB must operate at the SQL level, so their developers need a comprehensive understanding of the complex relational schema. In addition, run-time copies of SQL-based applications require a license fee.
The architecture of GDB 6.0 addresses these limitations through four requirements. The first is an open, federated architecture -- the monolithic database will be partitioned vertically into individual databases such that each can operate independently of the others. This will provide a framework for simpler management, better growth, and faster deployment of GDB. The second requirement is conceptual modeling of the databases. As DBMS vendors come and go and physical architectures wax and wane in popularity, we need to protect our investment in the domain semantics captured in the database schema. Therefore, we need a conceptual modeling environment that allows migration between DBMS vendors and physical architectures, and allows future adaptation to nonrelational systems. The third requirement is a graphical interface. The interface to GDB 6.0 should support X Windows, PC, and Mac platforms. To minimize platform-specific code development and maintenance, we also need a multiplatform software development system. The fourth requirement is a method for conceptual data access. To reduce the complexity of the schema for application developers and end users, we want to isolate the physical design from the much simpler conceptual design. Therefore, the interaction between clients (applications) and servers (databases) will be through a layer that translates the physical data to and from the conceptual data. With this approach, a database can be nontraditional (for example, nonrelational) and clients can still interact with it. Furthermore, if the translation layer is kept on the server side, then end users will not need to pay application licensing fees.
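The federated requirement described above can be illustrated with a small sketch. This is not GDB 6.0 code: the Federation class, the partition names "mapping" and "citation", and the table layouts are all hypothetical, and SQLite stands in for whatever DBMS each partition would actually use. The sketch only shows the property the abstract asks for, namely that each partition can be queried, attached, or removed without touching the others.

```python
import sqlite3


class Federation:
    """Hypothetical registry of independent databases (not a GDB API)."""

    def __init__(self):
        self._members = {}

    def register(self, name, connection):
        # Each member is a separate physical database.
        self._members[name] = connection

    def query(self, name, sql, params=()):
        # A query touches exactly one member database.
        return self._members[name].execute(sql, params).fetchall()

    def detach(self, name):
        # Removing one member leaves the others usable.
        self._members.pop(name).close()

    def names(self):
        return sorted(self._members)


def build_demo():
    """Build two in-memory partitions with illustrative data."""
    fed = Federation()

    mapping = sqlite3.connect(":memory:")
    mapping.execute("CREATE TABLE locus (symbol TEXT, chromosome TEXT)")
    mapping.execute("INSERT INTO locus VALUES ('CFTR', '7')")

    citation = sqlite3.connect(":memory:")
    citation.execute("CREATE TABLE citation (id INTEGER, title TEXT)")
    citation.execute("INSERT INTO citation VALUES (1, 'Example title')")

    fed.register("mapping", mapping)
    fed.register("citation", citation)
    return fed
```

In this sketch, detaching the "citation" partition (for maintenance, say) does not affect queries against "mapping", which is the independence the vertical partitioning is meant to provide.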
To fulfill these requirements we are developing GDB 6.0 with a mixture of commercial software, public domain tools, and custom development. The back-end issue of conceptual modeling is addressed by the Object Protocol Model tools from Victor Markowitz of Lawrence Berkeley Laboratory. They provide a method for specifying an object-oriented schema and a conceptual-level API to the resulting physical database. The front-end issue of a multiplatform GUI is addressed by the Galaxy programming suite from Visix, Inc. It is a C++ environment that allows one set of source code to be compiled for many computer platforms. The conceptual communication between the front end and the back end is addressed by building an "object broker" that will perform the translations between the physical data and the conceptual data.
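The role of the "object broker" can be sketched as follows. This is a minimal illustration under stated assumptions, not the GDB 6.0 broker and not the Object Protocol Model API: the Locus class, the ObjectBroker methods, and the loc_tbl and chr_tbl tables are hypothetical names invented for the example, with SQLite as a stand-in for the physical database. The point is that the client sees only a simple conceptual object, while the broker alone knows the normalized physical layout.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Locus:
    """Conceptual object seen by clients (hypothetical)."""
    symbol: str
    chromosome: str


def make_physical_db():
    """Create a normalized physical schema that clients never see directly."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chr_tbl (chr_id INTEGER PRIMARY KEY, chr_name TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE loc_tbl (sym TEXT PRIMARY KEY, "
        "chr_id INTEGER REFERENCES chr_tbl(chr_id))"
    )
    return conn


class ObjectBroker:
    """Translates between conceptual objects and physical rows (assumed design)."""

    def __init__(self, conn):
        self._conn = conn

    def put_locus(self, locus):
        # Conceptual -> physical: resolve or create the chromosome row first.
        row = self._conn.execute(
            "SELECT chr_id FROM chr_tbl WHERE chr_name = ?", (locus.chromosome,)
        ).fetchone()
        if row is None:
            chr_id = self._conn.execute(
                "INSERT INTO chr_tbl (chr_name) VALUES (?)", (locus.chromosome,)
            ).lastrowid
        else:
            chr_id = row[0]
        self._conn.execute(
            "INSERT INTO loc_tbl (sym, chr_id) VALUES (?, ?)", (locus.symbol, chr_id)
        )

    def get_locus(self, symbol):
        # Physical -> conceptual: the join is hidden from the client.
        row = self._conn.execute(
            "SELECT l.sym, c.chr_name FROM loc_tbl l "
            "JOIN chr_tbl c ON l.chr_id = c.chr_id WHERE l.sym = ?",
            (symbol,),
        ).fetchone()
        return None if row is None else Locus(*row)
```

A client would call broker.put_locus(Locus("CFTR", "7")) and broker.get_locus("CFTR") without writing SQL. Because the SQL lives only inside the broker, the physical schema or the DBMS could change without changes to client code.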