Skip to main content
SHARE
Publication

Data Jockey: Automatic Data Management for HPC Multi-Tiered Storage Systems...

Publication Type
Conference Paper
Book Title
2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Publication Date
Page Numbers
511 to 522
Publisher Location
Washington, District of Columbia, United States of America
Conference Name
33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019)
Conference Location
Rio de Janeiro, Brazil
Conference Sponsor
IEEE
Conference Date
-

We present the design and implementation of Data Jockey, a data management system for HPC multi-tiered storage systems. As a centralized data management control plane, Data Jockey automates bulk data movement and placement for scientific workflows and integrates into existing HPC storage infrastructures. Data Jockey simplifies data management by eliminating human effort in programming complex data movements, laying datasets across multiple storage tiers when supporting complex workflows, which in turn increases the usability of multitiered storage systems emerging in modern HPC data centers.

Specifically, Data Jockey presents a new data management scheme called “goal driven data management” that can automatically infer low-level bulk data movement plans from declarative high-level goal statements that come from the lifetime of iterative runs of scientific workflows. While doing so, Data Jockey aims to minimize data wait times by taking responsibility for datasets that are unused or to be used, and aggressively utilizing the capacity of the upper, higher performant storage tiers.

We evaluated a prototype implementation of Data Jockey under a synthetic workload based on a year’s worth of Oak Ridge Leadership Computing Facility’s (OLCF) operational logs. Our evaluations suggest that Data Jockey leads to higher utilization of the upper storage tiers while minimizing the programming effort of data movement compared to human involved, per-domain adhoc data management scripts.