Abstract
A critical step in structure-based drug discovery is predicting whether and how a candidate molecule binds to a model of a therapeutic target. However, substantial protein side-chain movements prevent current screening methods, such as docking, from accurately predicting ligand conformations, so expensive refinement is required to produce viable candidates. We present a high-throughput, flexible ligand pose refinement workflow called “tinyIFD”. Its main features are the use of mdgx.cuda, a specialized high-throughput molecular dynamics (MD) simulation code for small systems, and an active-learning model zoo approach. We apply this workflow to a large test set of diverse protein targets, achieving 66% and 76% success rates for finding a crystal-like pose within the top-2 and top-5 poses, respectively. We also apply the workflow to inhibitors of the SARS-CoV-2 main protease (Mpro), where we demonstrate the benefit of its active learning component.
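As an illustration of the top-k success metric quoted above, the following minimal sketch counts a target as a success when at least one of its k best-ranked poses is crystal-like. The abstract does not define "crystal-like"; the 2.0 Å RMSD cutoff, the function name `top_k_success_rate`, and the toy RMSD values are assumptions made for illustration only and are not taken from the tinyIFD workflow or its results.

```python
from typing import Sequence


def top_k_success_rate(
    ranked_rmsds: Sequence[Sequence[float]],
    k: int,
    threshold: float = 2.0,  # assumed cutoff in angstroms; not stated in the abstract
) -> float:
    """Return the fraction of targets with a crystal-like pose in the top k.

    ranked_rmsds[i] holds, for target i, the RMSD to the crystal pose of each
    predicted pose, ordered from best-ranked to worst-ranked.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not ranked_rmsds:
        raise ValueError("at least one target is required")
    hits = sum(
        1
        for rmsds in ranked_rmsds
        if any(r <= threshold for r in rmsds[:k])
    )
    return hits / len(ranked_rmsds)


# Toy, invented RMSD values (angstroms) for four hypothetical targets.
toy_rmsds = [
    [1.2, 3.5, 4.0, 5.1, 6.0],  # crystal-like pose at rank 1
    [4.8, 1.9, 3.3, 2.5, 7.0],  # crystal-like pose at rank 2
    [5.0, 4.2, 3.9, 1.5, 6.1],  # crystal-like pose at rank 4
    [6.3, 5.5, 4.4, 3.8, 3.1],  # no crystal-like pose in the top 5
]

top2 = top_k_success_rate(toy_rmsds, k=2)
top5 = top_k_success_rate(toy_rmsds, k=5)
print(f"top-2: {top2:.2f}, top-5: {top5:.2f}")
```

Because a larger k only widens the set of poses considered, the top-5 rate can never fall below the top-2 rate, which matches the ordering of the reported 66% and 76% figures.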