Abstract
Machine learning methods for training classifiers to detect low-level radiation sources are of interest when suitable training data sets are available. Their application and performance assessment, however, involve aspects of over-fitting and training data selection that are somewhat uncommon in other existing methods for this task.
We study U-235 gamma signatures using data sets collected by 21 NaI detectors under controlled conditions. The gamma spectra are collected by detectors located at different distances from the source, and we study which detectors' spectra to use as training sets for classifiers that detect a source. The detectors form near, middle and outer groups based on their distance to the source. Classifiers trained on the outer group are susceptible to over-fitting, that is, they achieve low training error but incur much higher testing error in independent tests. The other two groups achieve lower training error and comparable testing error, and the near group achieves the lowest error overall. In detecting a source at an unknown distance, the farther detectors in the middle group achieve the lowest overall testing error with limited over-fitting, indicating complex dependencies between the choice of training data and classifier performance.