1. Semi-supervised Learning Based on Distributionally Robust Optimization

We propose a novel method for semi-supervised learning (SSL) based on data-driven distributionally robust optimization (DRO) using optimal transport metrics. Our proposed method improves generalization error by using the unlabeled data to restrict the support of the worst-case distribution in our DRO formulation. We make the formulation practical by proposing a stochastic gradient descent algorithm for the training procedure. We demonstrate that our semi-supervised DRO method improves the generalization error over natural supervised procedures and state-of-the-art SSL estimators. Finally, we discuss the large-sample behavior of the optimal uncertainty region in the DRO formulation; this discussion exposes important aspects such as the role of dimension reduction in SSL.

1.1. Introduction

We propose a novel method for semi-supervised learning (SSL) based on data-driven distributionally robust optimization (DRO) using an optimal transport metric, also known as the earth mover's distance (see [RUB 00]).
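To give a concrete feel for the earth mover's distance, here is a minimal sketch for the simplest case only: two one-dimensional samples of equal size with uniform weights and cost |x - y|. The function name `emd_1d` is hypothetical and is not part of the chapter's method; the general multivariate case requires solving an optimal transport problem.

```python
def emd_1d(xs, ys):
    """Earth mover's distance between two equal-size 1-D samples.

    Illustrative assumption: uniform weights and cost |x - y|.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension, the optimal plan matches the sorted points in order.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


if __name__ == "__main__":
    # Every point has to move 5 units, so the average cost is 5.
    print(emd_1d([0, 1, 3], [5, 6, 8]))
```

The value is the average distance each unit of mass travels under the cheapest rearrangement of one sample into the other.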

Our approach improves generalization error by using the unlabeled data to restrict the support of the models that lie in the region of distributional uncertainty. The intuition is that our mechanism for fitting the underlying model is automatically tuned to generalize beyond the training set, but only over potential instances ...
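As a rough sketch of the kind of problem described here, a generic optimal-transport DRO estimator can be written as below. All symbols are assumptions introduced for illustration and are not taken from the text: $\beta$ denotes the model parameters, $\ell$ a loss function, $P_n$ the empirical distribution of the labeled data, $D_c$ an optimal transport discrepancy with cost $c$, $\delta$ the size of the uncertainty region, and $\mathcal{X}_N$ a finite set of candidate points built from the labeled and unlabeled data.

```latex
\min_{\beta} \;
\max_{\substack{P \,:\, D_c(P, P_n) \le \delta \\ \operatorname{supp}(P) \subseteq \mathcal{X}_N}}
\; \mathbb{E}_{P}\!\left[ \ell(X, Y; \beta) \right]
```

Under these assumptions, the inner maximization picks the worst-case distribution within distance $\delta$ of $P_n$, and the support constraint is where the unlabeled data would enter.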

