Abstract
High-temperature oxidation and corrosion degradation mechanisms dictate the lifetime of materials critical to energy production. Combining modeling approaches such as machine learning (ML) and data analytics with sufficient experimental data can accelerate the development of new materials while limiting cost. In the present work, ML was applied to two high-temperature oxidation data libraries (Oak Ridge National Laboratory and National Aeronautics and Space Administration) comprising about 5000 mass change sample datasheets for a variety of materials and temperatures in dry air and air + 10% H2O. A Python code was developed to prepare the data for ML by collecting and formatting oxidation rate constants, alloy compositions, and exposure environments into a single data frame. The scikit-learn library and the Statistics and Machine Learning Toolbox in MATLAB (MathWorks) were then used to perform unsupervised clustering and supervised regression. The impact of dataset distribution on the performance of the developed ML models was evaluated. Potential strategies to improve the predictions and enhance the extrapolative capability of the previously trained models were investigated.
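
As an illustration of the data preparation step, the following minimal sketch shows one way a mass change record could be reduced to a rate constant and flattened into a single row alongside composition and environment. It assumes parabolic kinetics, (Δm/A)² = k_p·t, which is not stated above. The `Exposure` class, its field names, and the functions `parabolic_rate_constant` and `to_row` are hypothetical and are not taken from the code developed in this work. All numerical values are synthetic placeholders, not measured data.

```python
from dataclasses import dataclass


@dataclass
class Exposure:
    """One hypothetical mass-change record (field names are illustrative)."""
    alloy: str
    composition_wt_pct: dict  # element -> wt.%
    temperature_c: float      # exposure temperature, deg C
    environment: str          # "dry air" or "air + 10% H2O"
    time_h: list              # exposure times, h
    mass_change: list         # specific mass change, mg/cm^2


def parabolic_rate_constant(time_h, mass_change):
    """Least-squares slope through the origin of (dm/A)^2 versus t.

    Assumes parabolic kinetics, (dm/A)^2 = kp * t.
    Returns kp in mg^2 cm^-4 h^-1.
    """
    num = sum(t * m ** 2 for t, m in zip(time_h, mass_change))
    den = sum(t ** 2 for t in time_h)
    return num / den


def to_row(rec):
    """Flatten one record into a row: rate constant, environment, composition."""
    row = {
        "alloy": rec.alloy,
        "temperature_c": rec.temperature_c,
        "wet_air": int("H2O" in rec.environment),  # 1 if water vapor present
        "kp": parabolic_rate_constant(rec.time_h, rec.mass_change),
    }
    # One column per alloying element, e.g. "wt_Cr"
    row.update({f"wt_{el}": pct for el, pct in rec.composition_wt_pct.items()})
    return row


# Synthetic placeholder records (not measured data)
records = [
    Exposure("alloy_A", {"Ni": 60.0, "Cr": 20.0}, 800.0, "dry air",
             [100, 400, 900], [0.1, 0.2, 0.3]),
    Exposure("alloy_B", {"Ni": 70.0, "Cr": 15.0}, 800.0, "air + 10% H2O",
             [100, 400, 900], [0.2, 0.4, 0.6]),
]
rows = [to_row(r) for r in records]
```

A list of such rows can be passed to `pandas.DataFrame` to obtain a single table suitable for clustering or regression.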