
DFT-FE 1.0: A massively parallel hybrid CPU-GPU density functional theory code using finite-element discretization...

by Sambit Das, Phani Motamarri, Vishal Subramanian, David M Rogers, Vikram Gavini
Publication Type
Journal Name
Computer Physics Communications
Publication Date
Page Numbers
108473 to 108473

We present DFT-FE 1.0, building on DFT-FE 0.6 [Comput. Phys. Commun. 246, 106853 (2020)], to conduct fast and accurate large-scale density functional theory (DFT) calculations (reaching ∼100,000 electrons) on both many-core CPU and hybrid CPU-GPU computing architectures. This work involves improvements in the real-space formulation, via an improved treatment of the electrostatic interactions that substantially enhances the computational efficiency, as well as in high-performance computing aspects, including the GPU acceleration of all the key compute kernels in DFT-FE. We demonstrate the accuracy by comparing the ground-state energies, ionic forces and cell stresses on a wide range of benchmark systems against those obtained from widely used DFT codes. Further, we demonstrate the numerical efficiency of our implementation, which yields ∼20× CPU-GPU speed-up by using GPU acceleration on hybrid CPU-GPU nodes. Notably, owing to the parallel scaling of the GPU implementation, we obtain wall-times of 80−140 seconds for full ground-state calculations, with stringent accuracy, on benchmark systems containing ∼6,000−15,000 electrons.