Researchers led by the University of Melbourne, Australia, have been nominated for the Association for Computing Machinery’s 2024 Gordon Bell Prize in supercomputing for conducting a quantum molecular dynamics simulation 1,000 times larger and faster than any previous simulation of its kind.
The team also includes researchers from AMD, QDX and the Department of Energy’s Oak Ridge National Laboratory. Using Frontier, the world’s most powerful supercomputer, the team calculated a system containing more than 2 million correlated electrons. Winners of the Gordon Bell Prize will be announced at the 2024 Supercomputing Conference in Atlanta, Georgia, Nov. 17 to 22.
“It’s an honor to work with such an amazing team of experts and be recognized for this achievement,” said lead researcher Giuseppe Barca, an associate professor and programmer at the University of Melbourne. “We intend to continue building on this new level of sophistication and keep pushing the limits of scientific computing.”
The time-resolved quantum chemistry calculations are the first to exceed an exaflop, more than a quintillion calculations per second, using double-precision arithmetic. Computing with the roughly 16 decimal digits provided by double precision is demanding, but that extra precision is required for many scientific problems. In addition to setting a new benchmark, the achievement also provides a blueprint for enhancing algorithms to tackle larger, more complex scientific problems using leadership-class exascale supercomputers.
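As an illustration (not drawn from the team's code), the gap between single and double precision is easy to see by accumulating a small increment many times, much as a time-stepped simulation repeatedly accumulates small contributions:

```python
import numpy as np

# Accumulate a tiny increment one million times. np.cumsum runs the
# running sum sequentially in the array's dtype, so rounding error
# builds up step by step, just as it would in a long simulation.
n = 1_000_000
step = 1e-7  # the exact sum would be 0.1

sum32 = np.cumsum(np.full(n, step, dtype=np.float32))[-1]  # single precision
sum64 = np.cumsum(np.full(n, step, dtype=np.float64))[-1]  # double precision

print(f"float32 result: {sum32}")
print(f"float64 result: {sum64}")
# The float64 result stays extremely close to 0.1, while the float32
# result drifts noticeably further from the exact answer.
```

The double-precision sum lands within about a billionth of the exact value; the single-precision sum drifts visibly, which is why scientifically meaningful long runs often cannot trade precision for speed.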
“This is a game changer for many areas of science, but especially for drug discovery using high-accuracy quantum mechanics,” said Barca at the time of the simulation. “Historically, researchers have hit a wall trying to simulate the physics of molecular systems with highly accurate models because there just wasn’t enough computing power,” he added. “So, they have been confined to only simulating small molecules, but many of the interesting problems that we want to solve involve large models.”
To overcome these limitations, Barca and his team developed EXESS, or the Extreme-scale Electronic Structure System. Rather than going through the painstaking process of scaling up legacy codes written for previous-generation petascale machines, Barca decided to write a new code designed specifically for exascale systems like Frontier, whose hybrid architecture combines central processing units, or CPUs, with graphics processing units, or GPUs.
“Being able to accurately predict the behavior and model the properties of atoms either in larger molecular systems or with more fidelity is fundamentally important for developing new, more advanced technologies, including improved drug therapeutics, medical materials and biofuels,” said Dmytro Bykov, group leader in computing for chemistry and materials at ORNL. “This is why we built Frontier, so we can push the limits of computing and do what hasn’t been done.”
Pushing way past the petaflop
The HPE Cray EX Frontier supercomputer, located at the Oak Ridge Leadership Computing Facility, or OLCF, is currently ranked No. 1 on the TOP500 list of the world’s fastest supercomputers after achieving a maximum performance of 1.2 exaflops. Frontier has 9,408 nodes with more than 8 million processing cores from a combination of AMD 3rd Gen EPYC™ CPUs and AMD Instinct™ MI250X GPUs.
The team’s efforts on Frontier were a huge success. They ran a series of simulations that utilized 9,400 Frontier computing nodes to calculate the electronic structure of different proteins and organic molecules containing hundreds of thousands of atoms.
The average run times of the simulations ranged from minutes to several hours. The new algorithm enabled the team to simulate atomic interactions in time steps, essentially snapshots of the system, far faster than previous methods allowed. For example, time steps for protein systems with thousands of electrons can now be completed in as little as 1 to 5 seconds.
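For context, a time step in any molecular dynamics code advances the system's state by a small increment of simulated time. A minimal classical sketch of that loop, using velocity Verlet integration on a one-dimensional harmonic oscillator, is below; this is a toy illustration of time stepping in general, not the team's quantum method or the EXESS code:

```python
import numpy as np

def simulate(x0=1.0, v0=0.0, k=1.0, m=1.0, dt=0.01, n_steps=1000):
    """Velocity Verlet time stepping for a 1-D harmonic oscillator.

    Each loop iteration is one 'time step': a snapshot of position and
    velocity advanced by dt. Real molecular dynamics replaces the toy
    force below with forces computed from the full molecular system.
    """
    x, v = x0, v0
    traj = [x]
    for _ in range(n_steps):
        a = -k * x / m                      # acceleration at current position
        x = x + v * dt + 0.5 * a * dt * dt  # advance position by one step
        a_new = -k * x / m                  # acceleration at new position
        v = v + 0.5 * (a + a_new) * dt      # advance velocity symmetrically
        traj.append(x)
    return np.array(traj)

traj = simulate()
```

In the quantum case reported here, the expensive part of each step is not the toy force above but solving for the electronic structure of the whole system, which is why completing a step in seconds for thousands of electrons is notable.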
“Two of the biggest challenges in this achievement were designing an algorithm that could push Frontier to its limits and ensuring the algorithm would run on a system that has more than 37,000 GPUs,” Bykov said. “The solution meant using more computing components, and any time you add more, it also means there’s a greater chance that one of those parts is going to break at some point. The fact that we used the entire system is incredible, and it was remarkably efficient.”
“I cannot describe how difficult it was to achieve this scale both from a molecular and a computational perspective,” Barca said. “But it would have been meaningless to do these calculations using anything less than double precision. So, it was either going to be all or nothing.”
Support for this research came from the DOE Office of Science’s Advanced Scientific Computing Research program. The OLCF is a DOE Office of Science user facility.
UT-Battelle manages ORNL for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit energy.gov/science.
Read the full story: “Game-changing quantum chemistry calculations push new boundaries of exascale Frontier.”