Evolving computation to better understand biology: The vision we are working toward is to be able to research biological systems using computational approaches that investigate, analyze and model observed phenomena more holistically and precisely, and to improve:
- the predictive capabilities of the computational models toward guided discovery and experimental design
- experimental methods and approaches that maximize scientific benefit for the experimental cost in dollars and/or time,
- and thereby our scientific understanding of how biological systems work and could be changed in the context of their environment, or change their environment.
Achieving this requires important changes both in how data is experimentally generated and managed and in the computational methods and processes, especially a move from one-step-at-a-time workflows to many parallel ones. Biological research will increasingly be driven by evolving computational analysis and prediction.
My career has been devoted to computational modeling and to developing tools and systems for biological research. I started as a programmer on genetic linkage analysis software, working for Jurg Ott on LIPED, the first program that modeled genetics in complex pedigrees. It was used to locate the Huntington’s disease gene in 1983, the first autosomal human disease gene mapped using DNA polymorphisms. This led to my work as Directeur Informatique at CEPH (Centre d'Etude du Polymorphisme Humain), which established an international consortium to create the first genetic maps of the human genome. My role was to oversee data collection, management and analysis; I also contributed to some of the disease mapping projects. As data sets grew, the computational time for increasingly refined linkage maps grew exponentially and became a significant impediment. To offset this, my collaborators and I made algorithmic modifications in a rewrite of the linkage programs to create FASTLINK, which eventually also included data error modeling and parallelization and yielded vastly improved run times and predictive power. At GDB, the Human Genome Database, a multi-agency international project for the integration of human mapping data, I led software engineering, project management and community outreach efforts, especially to the Human Chromosome Mapping Workshops. From 1997 to 2008 I worked with several biotech companies, leading groups that ranged from small bioinformatics teams to large research computing groups, in all cases relating genes and function to drug or treatment targets. Since then I have been at ORNL leading the Bioinformatics group, where the majority of my effort has been related to KBase, first in leading the effort to establish the need and vision, and then in the implementation. KBase is not only a technological revolution for biological research but also a sociological one.
My background in large software and database projects, in both academic and commercial settings, brings experience to this project that has helped in the transition to a more user-focused production operation, working together with the other PIs, Adam Arkin at LBNL and Chris Henry at ANL.
Google Scholar: https://scholar.google.com/citations?user=1PWI1ssAAAAJ&hl=en
DOE Systems Biology Knowledgebase (KBase): https://www.kbase.us/