Genome Informatics Section 

DOE Human Genome Program Contractor-Grantee Workshop VII 
January 12-16, 1999  Oakland, CA


78a. Our Vision for a New Macromolecular Structure Database: The New Protein Data Bank

Helen M. Berman1, Gary Gilliland2, Peter Arzberger3, Phil Bourne4, John Westbrook5, Phoebe Fagan6 

1Rutgers University 
2National Institute of Standards and Technology (NIST) 
3UC San Diego Supercomputer Center 
4San Diego State College 
5Rutgers University 
6NIST 

The rate of growth in the number of structures experimentally determined by X-ray crystallography and NMR methods promises to increase dramatically as the focus continues to shift toward understanding sequence-structure-function relationships. Structure data will fully support our understanding of function only if they are well annotated, integrated at all levels of detail with related sources of biological information, and widely disseminated to the growing user community. These data must also be consistent so that the full potential of discovery through comparative analysis can be realized.

To this end, groups from Rutgers, the State University of New Jersey, the San Diego Supercomputer Center (SDSC) of the University of California, San Diego (UCSD), and the National Institute of Standards and Technology (NIST) have formed the Research Collaboratory for Structural Bioinformatics (RCSB). The combined experience of RCSB members in structure data processing and analysis covers data validation, data modeling, database development, query languages, and visualization tool development. The RCSB has developed and currently maintains nine publicly available structural biology databases.

For its first collaborative project, the RCSB has created the new Protein Data Bank, or Macromolecular Structure Database (MSD), which has several key components:
1) high-throughput, well-annotated data produced in a single pass;
2) a system that will enable the creation of a uniform archive; 
3) powerful query capabilities across multiple databases containing native and derived data via an integrated query interface; 
4) dynamic cross-links to information in other biological databases; and
5) review and analysis of structure and sequence neighbors.

The data uniformity process for the new Protein Data Bank will be presented, user input will be sought, and plans for the future will be discussed, with emphasis on enabling the field of structural bioinformatics.


 