High-Performance Deep Learning Toolbox for Genome-Scale Prediction of Protein Structure and Function

Publication Type

Conference Paper

Book Title

2021 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC)

Publication Date

November, 2021

Page Numbers

46 to 57

Publisher Location

United States of America

Conference Name

7th Workshop on Machine Learning in High Performance Computing Environments (MLHPC)

Conference Location

St. Louis, Missouri, United States of America

Conference Sponsor

IEEE

Conference Date

Nov 15, 2021

Abstract

Computational biology is one of many scientific disciplines ripe for innovation and acceleration with the advent of high-performance computing (HPC). In recent years, the field of machine learning has also seen significant benefits from adopting HPC practices. In this work, we present a novel HPC pipeline that incorporates various machine-learning approaches for structure-based functional annotation of proteins on the scale of whole genomes. Our pipeline makes extensive use of deep learning and provides computational insights into best practices for training advanced deep-learning models for high-throughput data such as proteomics data. We showcase methodologies our pipeline currently supports and detail future tasks for our pipeline to envelop, including large-scale sequence comparison using SAdLSA and prediction of protein tertiary structures using AlphaFold2.
