Error-Bounded Graph Construction for Semi-supervised Manifold Learning

by Christopher T Symons

Publication Type

Conference Paper

Journal Name

ACM Digital Library

Publication Date

August, 2018

Page Number

Volume

TBD

Conference Name

14th International Workshop on Mining and Learning with Graphs (held in conjunction with KDD'18)

Conference Location

London, United Kingdom

Conference Sponsor

ACM SIGKDD

Conference Date

Aug 18, 2018 - Aug 23, 2018

Abstract

Graphs are commonly used in semi-supervised learning to represent a manifold on which the data reside in a high-dimensional ambient space. The graph can then be utilized in different ways, typically via the Laplacian of the graph, in order to leverage associations among the unlabeled data to improve learning. One common way to leverage the graph Laplacian is as a regularization term, where models that would disagree with the graph are penalized. More often the spectrum of the graph Laplacian is used to find a lower dimensional embedding in which neighboring relations encoded via the graph are preserved. Most manifold-based methods of semi-supervised learning depend upon geometric structure in the ambient feature space in order to construct a graph whose edges encode similarity that should be useful in selecting a model. A critical assumption is that some standard measure of similarity applied to the ambient space can be used to construct a graph that is error-free or of low error, meaning that examples (i.e., vertices) from distinct classes are not connected. However, this assumption often precludes the use of such methods in noisy or complex feature spaces, even though such spaces often arise in problems that can most benefit from structure that might be uncovered within the unlabeled data. This paper presents a method of graph construction for manifold-based semi-supervised learning that respects the manifold assumptions underlying these methods and bounds the error on the graph itself, which then permits bounds on the overall generalization error of the learning algorithms without relying on assumptions that do not hold in many modern problem domains.

Error-Bounded Graph Construction for Semi-supervised Manifold Learning

Abstract

Organizations