It's About Time: Multi-Resolution Timers for Scalable Performance Debugging

by James White
Publication Type
Conference Paper
Book Title
CUG 2007 Proceedings
Publication Date
Page Number
Publisher Location
Philomath, Oregon, United States of America
Conference Name
CUG 2007
Conference Location
Seattle, Washington, United States of America
Conference Sponsor
Cray User Group
Conference Date

Traditional performance profiling of highly parallel applications does not always give enough information to diagnose performance bugs, particularly those caused by load imbalances and performance variability, yet the data files for such profiling can grow linearly with parallel task count. In response to these limitations, I have developed application timers designed to limit data and reporting volumes at high task counts without dispersing the signals of load imbalance and performance variability. I will describe the use of these timers to diagnose actual performance bugs in the Parallel Ocean Program running on a Cray XT4.