Steve Rozen, John Lehman, Lincoln Stein, Nathan Goodman
Whitehead Institute for Biomedical Research, One Kendall Square, Cambridge MA 02139.
We are constructing a data-management component tuned to the requirements of genome
applications. This component will offer many of the services commonly provided by database
management systems (DBMSs), including:
We do not seek to provide extremely high transaction rates and sophisticated query optimizers
on a flat data model; existing high-end commercial DBMSs provide such facilities. Instead, the
core of this genome data manager is designed to:
The core data manager can be customized to support a variety of specific data models (for example ACEDB,[1] OPM,[2] or ODMG-9x[3]) and a variety of applications (for example, backing WWW servers, laboratory notebook databases, and organism and community databases). Since the data manager will be highly portable and free of licensing fees, we expect that it will be attractive as a database for distributing copies of organism or community databases.
We will be reporting progress on the core data manager's architecture and interface at http://www.broad.mit.edu/informatics/cdm.html, and we solicit comments on its design. We are currently extending the core data manager to provide ACEDB compatibility for schemas, data-transfer files, the client/server interface, and (if needed by potential users) the function-call application program interface. We also contemplate extending the core data manager to provide other specific data models, depending on the interest of potential users.
The core data manager is being constructed on top of transactional libdb (Berkeley UNIX's dbopen(3) with concurrency control and recovery added).[4] We are collaborating with libdb's authors to provide a portable, POSIX-compliant implementation of transactions for this library.
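As a rough illustration of the layering described above, the following sketch stores named, class-tagged objects in a persistent key/value store. It is not the core data manager's interface. It assumes Python's standard dbm.dumb module as a stand-in for libdb, and the "Class:name" key layout, the JSON encoding, the function names put_object and get_object, and the sample record are all hypothetical. The stand-in store provides no concurrency control or recovery, so the transactional behavior is not shown.

```python
import dbm.dumb
import json
import os
import tempfile


def put_object(db, class_name, name, attrs):
    """Store one object under a 'Class:name' key (hypothetical layout)."""
    key = f"{class_name}:{name}".encode("utf-8")
    db[key] = json.dumps(attrs, sort_keys=True).encode("utf-8")


def get_object(db, class_name, name):
    """Return the stored attributes, or None if the object is absent."""
    key = f"{class_name}:{name}".encode("utf-8")
    if key not in db:
        return None
    return json.loads(db[key].decode("utf-8"))


def demo():
    """Write a record, close the store, reopen it, and read the record back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example_store")
        # Hypothetical record; the class and field names are illustrative only.
        record = {"chromosome": "1", "note": "hypothetical example record"}
        with dbm.dumb.open(path, "c") as db:
            put_object(db, "Marker", "example_marker_1", record)
        # Reopen to confirm that the record persisted on disk.
        with dbm.dumb.open(path, "r") as db:
            return get_object(db, "Marker", "example_marker_1")


if __name__ == "__main__":
    print(demo())
```

Keeping the object layer separate from the storage layer in this way is what would let a different data model reuse the same underlying store.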
*Supported by a grant from the U. S. Department of Energy under contract DE-FG02-95ER62101.
[1] R. Durbin and J. Thierry-Mieg. A C. elegans database, 1991. Documentation, code and data available from anonymous ftp servers at ncbi.nlm.nih.gov.
[2] I.-M. A. Chen and V. M. Markowitz. An overview of the Object Protocol Model (OPM) and the OPM data management tools. Information Systems, 20(5):393-418, July 1995.
[3] R. Cattell, T. Atwood, J. Duhl, G. Ferran, M. Loomis, and D. Wade. The Object Database Standard: ODMG-93 Release 1.1. Morgan Kaufmann Publishers, 1994.
[4] M. Seltzer and M. Olson. LIBTP: Portable, modular transactions for UNIX. USENIX Winter 1992 Technical Conference, 1992.