DYFLOW: A flexible framework for orchestrating scientific workflows on supercomputers

by Swati Singhal, Alan Sussman, Matthew D Wolf, Kshitij V Mehta, Jong Youl Choi

Publication Type

Conference Paper

Journal Name

50th International Conference on Parallel Processing Workshop

Book Title

ICPP Workshops '21: 50th International Conference on Parallel Processing Workshop

Publication Date

August, 2021

Page Numbers

1 to 11

Conference Name

50th International Conference on Parallel Processing Workshop (ICPP)

Conference Location

Chicago, Illinois, United States of America

Conference Sponsor

IEEE

Conference Date

Aug 9, 2021

Abstract

Modern scientific workflows are increasing in complexity with growth in computational power, incorporation of non-traditional computation methods, and advances in technologies enabling data streaming to support on-the-fly computation. These workflows have unpredictable runtime behaviors, and a fixed, predetermined resource assignment on supercomputers can be inefficient for overall performance and throughput. The inability to change resource assignments also prevents scientists from pursuing science-driven opportunities or responding to failures.

We introduce DYFLOW, a flexible framework that orchestrates scientific workflows on supercomputers based on user-designed policies. DYFLOW compartmentalizes orchestration stages into simplified constructs, which end-users can program and reuse according to their workflow requirements through an easy-to-use interface. These constructs hide the intricacies of runtime management from end-users, such as procuring information to understand the workflow state and assessing and supervising runtime changes. DYFLOW is designed to work alongside existing workflow management systems and reuse the available (static) support for workflow management. We have integrated DYFLOW with an existing workflow management tool as a demonstration. With experiments performed on use cases from three types of scientific workflows and two different parallel architectures, we show that DYFLOW achieves the desired orchestration while incurring only a small cost to carry out the runtime changes.
