Experiences in autotuning matrix multiplication for energy minimization on GPUs

by Hartwig Anzt, Blake Haugen, Jakub Kurzak, P. Luszczek, Jack J. Dongarra

Publication Type

Journal

Journal Name

Concurrency and Computation: Practice and Experience

Publication Date

December 2015

Page Numbers

5096 to 5113

Abstract

In this paper, we report extensive results and analysis of autotuning the computationally intensive graphics processing unit (GPU) kernel for dense matrix-matrix multiplication in double precision. In contrast to traditional autotuning and/or optimization for runtime performance only, we also take energy efficiency into account. For kernels achieving equal performance, we show significant differences in their energy balance. We also identify memory throughput as the most influential metric that trades off performance and energy efficiency. As a result, the performance-optimal case ends up not being the most efficient kernel in overall resource use. Copyright © 2015 John Wiley & Sons, Ltd.
