Publication

Estimation of RTT and Loss Rate of Wide-Area Connections Using MPI Measurements...

by Nageswara S Rao, Neena Imam, Zhengchun Liu, Rajkumar Kettimuthu, Ian Foster
Publication Type
Conference Paper
Book Title
Proceedings of Workshop Innovating the Network for Data-Intensive Science (INDIS)
Publication Date
Page Numbers
1 to 8
Publisher Location
Denver, Colorado, United States of America
Conference Name
Workshop Innovating the Network for Data-Intensive Science (INDIS)
Conference Location
Denver, Colorado, United States of America
Conference Sponsor
SC2019
Conference Date
-

Scientific computations are expected to be increasingly distributed across wide-area networks, and the Message Passing Interface (MPI) has been shown to scale to support their communications over long distances. These computations should account for certain network parameters to ensure effective execution, for example, by avoiding highly congested and long connections. The execution times of MPI basic operations reflect the connection parameters, including the Round Trip Time (RTT) and loss rate. We describe five machine learning methods to estimate the connection RTT and loss rate using execution times of MPI basic operations. We utilize execution time measurements of MPI Sendrecv operations collected over emulated 10 Gbps connections with 0-366 ms round-trip times, wherein the longest connection spans the globe, under periodic losses of up to 20%. These methods provide disparate estimates of RTT and loss rate, namely, linear and non-linear, and smooth and non-smooth. Our results show that accurate estimates can be generated at low loss rates, but they become inaccurate at loss rates of 10% and higher. Overall, these results constitute a case study of the strengths and limitations of machine learning methods in inferring network-level parameters using application-level measurements.