Toward Large-Scale Image Segmentation on Summit...

Publication Type
Conference Paper
Book Title
Proceedings of the International Conference on Parallel Processing
Publication Date
Page Numbers
1 to 11
Conference Name
49th International Conference on Parallel Processing - ICPP
Conference Location
Vancouver, Canada
Conference Sponsor
Conference Date

Semantic segmentation of images is an important computer vision task that arises in a variety of application domains, such as medical imaging, robotic vision, and autonomous vehicles. While these domain-specific image analysis tasks involve relatively small image sizes (∼10² × 10²), many applications need to train machine learning models on image data with extents that are orders of magnitude larger (∼10⁴ × 10⁴). Training deep neural network (DNN) models on large-extent images is extremely memory-intensive and often exceeds the memory limits of a single graphics processing unit (GPU), the hardware accelerator of choice for computer vision workloads. Here, an efficient sample-parallel approach to training U-Net models on large-extent image data sets is presented. Its advantages and limitations are analyzed, and near-linear strong-scaling speedup is demonstrated on 256 nodes (1536 GPUs) of the Summit supercomputer. Using a single node of the Summit supercomputer, an early evaluation of a recently released model-parallel framework called GPipe is shown to deliver ∼2X speedup in executing a U-Net model with an order of magnitude more trainable parameters than previously reported. Performance bottlenecks for pipelined training of U-Net models are identified, and mitigation strategies to improve the speedups are discussed. Together, these results open up the possibility of combining both approaches into a unified, scalable pipelined and data-parallel algorithm to efficiently train U-Net models with very large receptive fields on data sets of ultra-large-extent images.
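
As a rough illustration of why image extent drives memory use, the sketch below estimates the storage for a single dense feature map at the two extents mentioned in the abstract. The channel count (64) and 4-byte float32 values are assumptions chosen for illustration, not figures from the paper, and the function name is hypothetical. A real U-Net keeps many such maps alive during training, so this is a lower bound on one layer only.

```python
def feature_map_bytes(height, width, channels=64, bytes_per_value=4):
    """Bytes needed to store one dense feature map of shape (height, width, channels).

    channels=64 and bytes_per_value=4 (float32) are illustrative assumptions.
    """
    return height * width * channels * bytes_per_value


# Image extents taken from the abstract: ~10^2 x 10^2 versus ~10^4 x 10^4.
small = feature_map_bytes(10**2, 10**2)
large = feature_map_bytes(10**4, 10**4)

# The pixel count, and hence the memory for one map, grows by this factor.
ratio = large // small

print(f"small extent: {small / 1e6:.2f} MB per feature map")
print(f"large extent: {large / 1e9:.2f} GB per feature map")
print(f"growth factor: {ratio}")
```

Under these assumptions, one feature map at the larger extent takes about 25.6 GB, 10,000 times the 2.56 MB needed at the smaller extent, which suggests why a single accelerator's memory is quickly exhausted.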