Ramifications of Evolving Misbehaving Convolutional Neural Network Kernel and Batch Sizes...

by Mark A Coletti, Wadzanai D Lunga, Anne S Berres, Jibonananda Sanyal, Amy N Rose
Publication Type
Conference Paper
Journal Name
IEEE/ACM Machine Learning in HPC Environments (MLHPC)
Publication Date
Page Numbers
106 to 113
Conference Name
Machine Learning in HPC Environments (MLHPC 2018)
Conference Location
Dallas, Texas, United States of America
Conference Sponsor
The International Conference for High Performance Computing, Networking, Storage and Analysis
Conference Date

Deep-learners have many hyper-parameters, including learning rate, batch size, and kernel size, all of which play a significant role in estimating high-quality models. Discovering useful hyper-parameter guidelines is an active area of research, though the state of the art generally relies on a brute-force uniform grid search or random search to find ideal settings. We share preliminary results from an alternative approach to deep-learner hyper-parameter tuning that uses an evolutionary algorithm to improve the accuracy of deep-learner models used in satellite-imagery building footprint detection. We found that the evolved kernel and batch size hyper-parameters differed, surprisingly, from the sizes arrived at via a brute-force uniform grid approach. These differences suggest a novel role for evolutionary algorithms in determining the number of convolution layers, as well as for smaller batch sizes in improving deep-learner models.
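The evolutionary search over discrete hyper-parameter choices described in the abstract can be sketched roughly as follows. This is a minimal, hypothetical illustration, not the authors' implementation: the candidate kernel and batch sizes, the truncation-selection scheme, and especially the `fitness` function (a cheap stand-in for actually training a building-footprint CNN) are all assumptions for the sake of a runnable example.

```python
import random

# Hypothetical discrete search spaces for two CNN hyper-parameters.
KERNEL_SIZES = [3, 5, 7, 9, 11]
BATCH_SIZES = [16, 32, 64, 128, 256]

def fitness(individual):
    """Stand-in for validation accuracy of a trained model.

    In the paper's setting this would mean training and evaluating a
    deep-learner on satellite imagery; here we fake a landscape that
    peaks at kernel=5, batch=32 so the example runs instantly.
    """
    kernel, batch = individual
    return -abs(kernel - 5) - abs(batch - 32) / 16

def mutate(individual):
    """Re-sample one of the two genes at random."""
    kernel, batch = individual
    if random.random() < 0.5:
        kernel = random.choice(KERNEL_SIZES)
    else:
        batch = random.choice(BATCH_SIZES)
    return (kernel, batch)

def evolve(pop_size=10, generations=20, seed=0):
    """Simple (mu + lambda)-style loop with truncation selection."""
    random.seed(seed)
    population = [(random.choice(KERNEL_SIZES), random.choice(BATCH_SIZES))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half, refill the population with mutants.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=fitness)

best_kernel, best_batch = evolve()
```

The key contrast with grid search is that each generation concentrates evaluations near previously successful settings rather than spending a fixed budget uniformly across the grid.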