Dan Jacobson is illuminating the workings of biological systems from the molecular scale up by leveraging Oak Ridge National Laboratory’s supercomputing resources to create machine- and deep-learning techniques more easily understood by humans—an evolving field called explainable artificial intelligence (AI).
Understanding how complex interactions between genes and proteins in cells influence an organism’s traits and behaviors is what drives Jacobson’s work as a computational systems biologist. The researcher and his team integrate big data, supercomputing, mathematics, statistics, and biology to study questions in bioenergy, microbial systems, neuroscience, and precision medicine.
Jacobson’s work in bioenergy builds on a long-standing research initiative focused on Populus, a fast-growing perennial tree that shows promise as a low-cost, renewable feedstock for bioenergy. A large team of researchers collected data on 28 million genetic variations in Populus as part of the BioEnergy Science Center, with the work now continuing at the Center for Bioenergy Innovation at ORNL.
With this wealth of data and a grant from the Department of Energy’s Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, Jacobson and collaborators are creating an unprecedented view of the 3D interactions among components of the cellular machinery in poplar.
“We’re taking these very large and disparate data sets and integrating them into a whole that nobody really projected,” said Jacobson.
Predicting how genes, proteins, and small molecules interact inside a cell requires new approaches to mining big data with high-performance computing. Using explainable AI, researchers are starting to uncover the high-order interactions that their algorithms capture and to present those results in a form that humans can understand.
Jacobson is using explainable AI to discover molecular interactions in biological systems that lead to the emergent properties of the organism (physical characteristics, diseases, etc.), but he notes that the method can be adapted and applied to other projects to accelerate scientific discovery.
“No one complains when you poke a tree,” said Jacobson. “You can learn a lot developing algorithms for a better understanding of plants for bioenergy. We can pivot and apply that knowledge toward projects that are going to positively impact human health. The algorithm doesn’t care about the species.”
Big data, big science
Examining how genetics affects health is the aim of a collaborative project Jacobson is contributing to with support from the U.S. Department of Veterans Affairs and DOE. The project combines genetic, clinical, and lifestyle data with the ultimate goal of predicting and diagnosing diseases and tailoring treatments for individuals.
As one of the largest clinical genomics projects in the world, the initiative challenges scientists to use high-performance computing and health data from millions of veterans to understand the complex genetic underpinnings that affect medical disorders, drug interactions, drug specificity, and individuals’ responses to pharmaceuticals.
Jacobson’s interest in the veterans project is both professional and personal as his grandfather, father, brother, and nephew have served in the U.S. armed forces. He has a similar passion for investigating Alzheimer’s and other neurological disorders.
The human brain is the ultimate complex system, said Jacobson, who took graduate courses in neuroanatomy and neurochemistry as an undergraduate. “It just lit me up,” he said. “And so now, it is always on my radar.”
Some of the data sets Jacobson and his team are currently working with came from the Johns Hopkins School of Medicine, where he received his master of science degree in biochemistry and worked as an assistant professor early in his career. In fact, Jacobson pulls data from a wide variety of sources around the globe. Collaborative networks sharing data are essential to modern, large-scale biology. Jacobson has recently been selected to lead an Early Science project on Summit, ORNL’s new supercomputer, focused on human systems biology and drug discovery.
It was the big data housed at ORNL, combined with some of the fastest supercomputers in the world, that drew Jacobson to return home to Oak Ridge, where he spent his childhood.
From vineyards to bioenergy feedstocks
Jacobson grew up in a scientific household with his father, Bruce Jacobson, a biochemical geneticist at ORNL, as his role model. The pair had the opportunity to collaborate on a couple of projects during the early days of the Human Genome Project.
The collaboration took place during Jacobson’s first job at the laboratory as an intern before grad school. He worked as a biologist in a physics group, exploring different ways to image and tag biological molecules, after earning his bachelor of science degree in biochemistry at Florida State University.
When Jacobson and his father jointly presented their research at conferences, they often flew in from different parts of the world to do so.
Jacobson journeyed to South Africa to help establish national programs for bioinformatics and supercomputing. While there, he also secured his doctoral degree in computational biology as applied to wine biotechnology.
As the leader of a computational biology group at the Institute for Wine Biotechnology at Stellenbosch University, Jacobson studied, among other things, the grapevine phytobiome, including the plants, soil, environment, and associated microbial communities. He and his colleagues made fundamental discoveries that helped explain variations in wines made from the same cultivar grown in contiguous vineyards but using different farming practices (traditional, organic, and biodynamic).
“It was a really convenient excuse to do systems biology out in the field,” said Jacobson, who also appreciates fine wines. “It is one of the few—if any—scientific environments where you can drink the results of some of your experiments.”
It turns out that research into wines and grapevines has many parallels in bioenergy applications. In both cases, scientists are looking at similar questions about the growth of biomass and the fermentation processes that follow. This made the transition to bioenergy research at ORNL an easy one for Jacobson and several students from South Africa who made the move with him in 2014. This background has also enabled Jacobson and his students to make significant contributions to the ORNL Plant-Microbe Interfaces Project, which examines the fundamental relationships among plants and microbes to facilitate an understanding of sustainable systems and the use of renewable feedstocks for bioenergy.
Through all the work environments he has experienced and enjoyed—from academia to spin-off companies to non-governmental organizations—Jacobson had been subconsciously missing the interdisciplinary nature of a national lab, he admits.
“National labs think at larger scales,” said Jacobson. “My group and I are driven by big data, big science, and exciting opportunities. We’ve felt right at home here.”
The Oak Ridge Leadership Computing Facility, home of Summit, is a DOE Office of Science User Facility. ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit http://energy.gov/science.