Publication

A Case Study of MPI Over Long Distance Connections...

by Nageswara S Rao, Neena Imam, Swen Boehm
Publication Type
Conference Paper
Book Title
Proceedings of SYSCON2019
Page Numbers
1 to 4
Conference Name
13th Annual IEEE International Systems Conference (SysCon 2019)
Conference Location
Orlando, Florida, United States of America
Conference Sponsor
IEEE

Scientific workflows are increasingly distributed across wide-area networks, and their code executions are expected to span geographically dispersed computing systems. MPI has been used extensively to support communications for distributed computations, typically over compute clusters and high-performance systems within a single facility. We present a case study of the performance of basic MPI operations over long-distance connections, wherein TCP is used as the underlying transport. We report measurements of the execution times of MPI codes that use MPI_Sendrecv operations over emulated 10 Gbps connections with round-trip times of 0-366 ms, including the longest one, which spans the globe. The measurements demonstrate that basic MPI codes can be sustained over long-distance connections under external packet loss rates of up to 10%. They also highlight the qualitative effects of losses, which manifest as increased execution times as a consequence of TCP’s loss recovery process.
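
For a sense of scale, the link rate and longest round-trip time quoted in the abstract imply a large bandwidth-delay product, that is, the amount of data TCP must keep in flight to fill the connection. The sketch below is illustrative arithmetic only and is not taken from the paper; the function name `bdp_bytes` is a hypothetical helper introduced here.

```python
def bdp_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes for a link rate (bits/s) and RTT (ms).

    Hypothetical helper for illustration; not part of the paper.
    """
    # Bits in flight = rate_bps * (rtt_ms / 1000); divide by 8 to get bytes.
    return rate_bps * rtt_ms // 8000


RATE_BPS = 10_000_000_000  # 10 Gbps, the emulated link rate in the abstract
RTT_MS = 366               # longest round-trip time in the abstract

if __name__ == "__main__":
    megabytes = bdp_bytes(RATE_BPS, RTT_MS) / 1e6
    print(f"BDP at {RTT_MS} ms RTT: {megabytes:.1f} MB")
```

At 10 Gbps and 366 ms this comes to about 457.5 MB in flight, which suggests why loss recovery on such a connection can lengthen execution times noticeably.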