With the rise of the global pandemic, Omar Demerdash, a Liane B. Russell Distinguished Staff Fellow at ORNL since 2018, has become laser-focused on potential avenues to COVID-19 therapies.
Omar works in the Biosciences Division, where his mentor is division director Julie Mitchell.
Omar applies his expertise in machine learning–based methods to predict how strongly a molecule binds to a targeted protein. The knowledge gained from this basic research could shed light on molecules that may be tested further in the laboratory to determine whether they are potential drugs.
A disease may be associated with an abnormal function of one or more proteins. A drug is a type of molecule that modulates a protein’s function. Proteins are molecules, too, but due to their size, they’re referred to as macromolecules.
Computational science: Protein–drug sleuth
Machine learning relies on training models on “ground truth” data sets typically consisting of examples that have been tested in the lab. In Omar’s research, an example of this is whether a drug is effective or not effective. Machine learning combined with physics-based methods allows Omar to answer additional questions such as whether the protein–molecule binding modulates the disease protein’s function, perhaps by turning it on, making it more active, or turning it off, making it less active.
“I’ve developed models that rank drugs in terms of how strongly they will bind to a protein, with the simple underlying hypothesis that the more strongly a molecule binds to a protein, the better a candidate it is for further testing in the lab, potentially becoming a drug in the end,” he said.
Omar sifts through large libraries and databases of drugs and other small molecules, predicting which are the most likely candidate drugs for disease-associated proteins.
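The ranking idea Omar describes can be illustrated with a minimal sketch. It is not his model: the `Candidate` type, the molecule labels, and the scores below are hypothetical placeholders, and the sketch assumes each molecule already has a predicted binding free energy in which a more negative value means stronger binding.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    name: str            # hypothetical molecule label
    predicted_dg: float  # hypothetical predicted binding free energy (kcal/mol)


def rank_by_binding(
    candidates: Sequence[Candidate], top_k: Optional[int] = None
) -> list:
    """Sort candidates from strongest to weakest predicted binding.

    Assumes a more negative predicted free energy means stronger binding.
    """
    ranked = sorted(candidates, key=lambda c: c.predicted_dg)
    return ranked if top_k is None else ranked[:top_k]


# Placeholder library; the values are made up for illustration only.
library = [
    Candidate("mol_A", -6.2),
    Candidate("mol_B", -9.1),
    Candidate("mol_C", -7.4),
]

shortlist = rank_by_binding(library, top_k=2)
print([c.name for c in shortlist])  # ['mol_B', 'mol_C']
```

In a real screen the scores would come from a trained model rather than a hand-written list, and the shortlist would go on to laboratory testing.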
Making predictions is harder in the biological sciences than in other research areas, Omar points out, because there are “more moving parts and many different players.” In addition, proteins can be very large, making calculations difficult.
“Let’s say we want to simulate a whole cell with all its proteins,” he said. “If we modeled it at scale with all its individual atoms, the calculations would take an unreasonably long amount of time, even if all the computing power of Summit were utilized.”
COVID-19 focus: Omar’s work goes viral
The SARS-CoV-2 virus, Omar adds, contains a very large protein (the spike protein we’ve all seen depicted as protruding red spikes) that provides the virus’ principal route of entry into human cells. “Only parts of the protein’s structure have been resolved and visualized experimentally, so size is a challenge,” he said.
Omar also points out that viruses like SARS-CoV-2, which causes COVID-19, have numerous proteins that can be blocked, and that there are different ways to block them.
For example, the spike protein could be blocked from binding to its putative receptor on the human cell, or the virus could be blocked at the point where it’s trying to synthesize its RNA, which encodes the virus’ genetic material. The virus’ proteases could also be blocked, “throwing a monkey wrench in the virus’ replication machine,” Omar added.
Omar was originally brought on to apply machine learning–based computational methods to predict which drugs are most promising for study as targeted protein modulators for any human disease, including cancer. His research scope has since expanded to include plant–microbe interactions, with implications for understanding plant survival amid climate change and for maintaining plants as biofuel crops. He also has begun exploring the usefulness of neutron scattering data for improving modeling of proteins implicated in diseases such as Alzheimer’s and Parkinson’s.
“Currently, as it stands, biology is not a predictive science because of a host of reasons, including knowledge gaps regarding the relevant macromolecules and their putative functions, as well as limitations in computational models at the molecular, systems, and bioinformatics levels,” Omar said. “What I want to do is be able to improve the predictive power at the molecular and atomistic levels by leveraging the synergism of physics-based models, experimental data, and machine learning.”
Although Omar likes to say that any problem in chemistry or biology ultimately reduces to physics, there are inherent approximations in physics approaches that lead to decreased accuracy. “We often have to augment the physics with machine learning,” he pointed out. “But this allows us to make use of all the data that biologists are generating, and the data need to be well-curated and stored in a database such that the physical or computational scientist can download it and develop predictive models.”
Deep learning techniques effective with 3D protein structures
According to Omar, applying deep learning to drug discovery is a powerful method. This is because proteins are 3D structures and can be framed as images. “Proteins are made up of a sequence of amino acids, and that linear sequence folds into a 3D structure,” he said. “The dynamical interactions of the parts can be framed as images, making them amenable to deep neural networks.”
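One common way to present a 3D structure to an image-style neural network is a voxel grid, the 3D analogue of pixels. The article does not say which representation Omar uses, so the following is only a sketch under that assumption: the function name `voxelize`, the grid size, and the spacing are all hypothetical choices, and the coordinates are placeholders rather than real protein data.

```python
import math


def voxelize(atoms, grid_size=8, spacing=1.0):
    """Count atoms per voxel in a cubic grid centered on the origin.

    atoms: iterable of (x, y, z) coordinates in the same unit as `spacing`.
    Returns nested lists so that grid[i][j][k] is the atom count in a voxel.
    Atoms that fall outside the grid are skipped.
    """
    grid = [
        [[0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)
    ]
    half = grid_size * spacing / 2.0  # half the grid edge length
    for x, y, z in atoms:
        # Shift so the grid corner is at 0, then convert to voxel indices.
        idx = [math.floor((c + half) / spacing) for c in (x, y, z)]
        if all(0 <= i < grid_size for i in idx):
            i, j, k = idx
            grid[i][j][k] += 1
    return grid


# Placeholder coordinates, not taken from any real structure.
atoms = [(0.2, 0.2, 0.2), (0.4, 0.1, 0.3), (-3.5, -3.5, -3.5), (100.0, 0.0, 0.0)]
grid = voxelize(atoms)
print(grid[4][4][4], grid[0][0][0])  # 2 1
```

A grid like this could be fed to a 3D convolutional network in the same way a 2D image is fed to an ordinary one. Practical pipelines usually add separate channels for atom types and use finer spacing.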
Deep learning and associated methods work by making predictions based on learning patterns in training data. So, in the case of Omar’s COVID-19 research, he is looking for results that indicate strong versus weak protein binding of potential drug molecules—and how those molecules might modulate the protein’s function—to pinpoint the best candidate molecules for further experimentation in a laboratory.
Omar’s COVID-19–focused research is part of a large group effort begun at the Center for Molecular Biophysics, a collaboration between ORNL and the University of Tennessee–Knoxville that conducts research at the interface of biological, environmental, physical, computational, and neutron sciences.
Since Jeremy Smith, a UT Governor’s Chair and CMB director, and Micholas Smith simulated the spike protein bound to the ACE-2 receptor, the project has expanded to simulate all SARS-CoV-2 proteins with molecular dynamics simulations. Omar runs his models on protein–ligand complex structures generated with computational docking to screen potential drugs. ORNL researchers collaborate with Arvind Ramanathan at Argonne National Laboratory, who develops machine learning methods to make predictions from molecular dynamics simulations, and with Colleen Johnson at UT–Memphis to validate predictions in the laboratory.
When not working on new drug therapies or plant–microbe dynamics, Omar likes to work out and has composed music on keyboard. “If it had to go in a genre, I guess my music would wind up as new age or ambient,” he said.
Omar performed at open mic nights and small music festivals while he was a postdoctoral scholar at the University of California–Berkeley in the mid-2000s.
He says he likes many aspects of living in East Tennessee. “Having spent most of my life in Wisconsin, I really like the weather, not going to have to put on a jacket in the summer, and having so little snow,” Omar shared. He also cites the friendliness and down-to-earth nature of people here and “the best gym I’ve had to work out in after many years.”
UT-Battelle LLC manages Oak Ridge National Laboratory for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.
See ORNL’s main COVID-19 news site for more information on the laboratory’s fight against the novel coronavirus.