Publication

Background and Roadmap for a Distributed Computing and Data Ecosystem

by Mallikarjun Shankar, Eric Lancon
Publication Type
ORNL Report
Publication Date

The science drivers for using distributed computing and data resources are emerging from many science domains and have been documented in several recent Department of Energy (DOE) workshop reports. The increased usage and availability of geographically distributed computing and storage resources also highlight the need for easy-to-use tools and procedures that can handle this paradigm transparently. A large variety of computing resources, provisioned to meet each laboratory’s programmatic needs, is available throughout the DOE complex under various access policies, operating systems, and hardware configurations. However, no straightforward mechanisms or policies currently exist that allow a researcher or group of researchers to easily access resources at different locations across the DOE laboratories. At the same time, DOE research programs have supported the research and development of technologies and tools that could enable researchers to access distributed data and computing resources globally. Academic and industry activities have also developed mechanisms for distributed federation of resources. We observe that such existing capabilities can be harnessed to equip researchers to work seamlessly across facilities.

We envision the creation of a DOE Office of Science (SC)-wide federated Distributed Computing and Data Ecosystem (DCDE) that comprises tools, capabilities, services, and governance policies to enable researchers to seamlessly use a large variety of computing-related resources (i.e., scientific instruments, local clusters, large facilities, storage, enabling systems software, and networks) end-to-end across laboratories within the DOE environment. A successful DCDE would present researchers with a variety of distributed resources through a coherent and simple set of interfaces and allow them to manage data and related computations throughout the data management lifecycle. Envisioned as a cross-laboratory environment, a DCDE should establish a governance body that includes the relevant stakeholders to create effective use and participation guidelines.

To validate the DCDE approach, the working group proposed the development of a prototype that implements the main components of a DCDE in a coherent and progressive manner. The prototype will help define a general set of recommendations, supported by implementation experience, for expanding the DCDE to the whole laboratory complex, and will help produce an applicable governance model.