Abstract
Task-based programming models and execution paradigms provide a means to decompose a computation by expressing it as a graph in which each node represents a specific computation operating on memory objects and the edges define the dependencies in the execution flow. In this execution model, independent nodes in the graph can be executed concurrently on different computing devices, making it suitable for heterogeneous systems in which computing devices with different architectures coexist. However, careful memory orchestration across heterogeneous devices is needed because copies of the same memory object may reside on multiple devices during execution. Ensuring such orchestration manually is challenging. Not only must an application developer guard against race conditions, but they must also optimize data movement between the host and devices because unnecessary data movement significantly impacts performance. To mitigate these challenges, we enhance the IRIS heterogeneous runtime and introduce IRIS-MEMFLOW, a data-flow-enabled portable memory abstraction for seamlessly orchestrating memory in diverse heterogeneous computing environments. By using data-flow analysis, IRIS-MEMFLOW guards against race conditions when multiple heterogeneous devices access memory objects. IRIS-MEMFLOW also optimizes data movement between the host and devices without manual intervention. As a result, IRIS provides improved programming productivity, performance, and portability for multidevice heterogeneous executions in high-performance computing and cloud systems that run diverse architectures from different vendors. We evaluate IRIS-MEMFLOW through experiments that demonstrate its programming productivity, multidevice heterogeneity, portability, and low overhead compared with the state of the art.