- Cheng Tang, George Washington University, Washington, DC
A recurring pattern across machine learning is the empirical success of a handful of “heuristics”, i.e., simple learning procedures favored by practitioners, many of which lack formal theoretical justification. In unsupervised learning, Lloyd’s k-means algorithm, while provably exponentially slow in the worst case, remains popular for clustering problems arising in a wide range of applications. In supervised learning, random forests are another winning heuristic with many variants and applications. Perhaps the most prominent example is the blossoming field of deep learning, which is built almost entirely on heuristics; the practical success of a deep learning algorithm usually relies on an experienced user skillfully and creatively combining them. In this talk, I will discuss some of my thesis work on advancing the theoretical understanding of some of the most widely used machine learning heuristics.
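To make the k-means example concrete, here is a minimal sketch of Lloyd's heuristic (not code from the thesis): alternate between assigning each point to its nearest center and moving each center to the mean of its assigned points, until the centers stop changing. The function name and defaults below are illustrative choices.

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """A minimal sketch of Lloyd's k-means heuristic."""
    rng = np.random.default_rng(seed)
    # Initialize centers by sampling k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest center.
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its cluster
        # (keep the old center if a cluster becomes empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # converged to a local optimum
        centers = new_centers
    return centers, labels
```

Each iteration can only decrease the k-means objective, so the procedure always converges to a local optimum; the exponential worst case mentioned above refers to how many iterations that convergence can take.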