Skip to main content
Research Highlight

Layout-Aware Data Scheduler for Bulk Data Transfer over Terabit Networks

Achievement: Developed an end-to-end data transfer framework, named LADS, for bulk data transfer for terabit networks that is optimized for parallel file systems on both the source and sink. This work includes techniques to improve data transfer under congestion, without impacting other users of the parallel file systems.


Significance and Impact: This work enhances our ability to transfer bulk data between parallel file systems from different institutions over long-distance by using terabit networks, while minimizing the impact on other users of the parallel file system.


Research Details:

  • Develop a layout-aware data scheduler for bulk data transfer that coordinates accesses to the parallel file system and the terabit network.
  • Leverage high-performance capabilities of modern networks (e.g., operating system by-pass).
  • Advanced buffering mechanisms to minimize the effect of congestion of the parallel file system.

While future terabit networks hold the promise of significantly improving big-data motion among geographically distributed data centers, significant challenges must be overcome even on today’s 100 gigabit networks to realize end-to-end performance. Multiple bottlenecks exist along the end-to-end path from source to sink, for instance, the data storage infrastructure at both the source and sink and its interplay with the wide-area network are increasingly the bottleneck to achieving high performance. In this paper, we identify the issues that lead to congestion on the path of an end-to-end data transfer in the terabit network environment, and we present a new bulk data movement framework for terabit networks, called LADS. LADS exploits the underlying storage layout at each endpoint to maximize throughput without negatively impacting the performance of shared storage resources for other users. LADS also uses the Common Communication Interface (CCI) in lieu of the sockets interface to benefit from hardware-level zero-copy, and operating system bypass capabilities when available. It can further improve data transfer performance under congestion on the end systems using buffering at the source using flash storage. With our evaluations, we show that LADS can avoid congested storage elements within the shared storage resource, improving input/output bandwidth, and data transfer rates across the high speed networks. We also investigate the performance degradation problems of LADS due to I/O contention on the parallel file system (PFS), when multiple LADS tools share the PFS. We design and evaluate a meta-scheduler to coordinate multiple I/O streams while sharing the PFS, to minimize the I/O contention on the PFS. With our evaluations, we observe that LADS with meta-scheduling can further improve the performance by up to 14% relative to LADS without meta-scheduling.