10 Semi-Supervised Learning

Manish Devgan, Gaurav Malik and Deepak Kumar Sharma*

Department of Information Technology, Netaji Subhas University of Technology, New Delhi, India


Semi-supervised learning is a machine learning paradigm that combines labeled and unlabeled data to improve predictive accuracy. Unlike the supervised and unsupervised approaches [1], which rely solely on labeled and unlabeled data respectively, semi-supervised learning trains on both together, aiming to predict data points more accurately than either kind of data would allow alone. The motivation for using both types of data is that unlabeled data is readily available in enormous quantities, whereas labeled data is scarce and labeling unlabeled data is expensive. Semi-supervised learning emerged as a response to the scarcity of labeled data for natural systems, and as a potentially powerful quantitative tool for modeling the abundant unlabeled data around us. It starts by using the labeled data to make sense of the unlabeled data, and then trains the machine on the natural system [1, 2].
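As one illustration of how labeled data can be used to make sense of unlabeled data, the following is a minimal self-training sketch, not the chapter's own implementation. It assumes one-dimensional points, at least two classes, and a nearest-centroid classifier; the names `fit_centroids`, `predict`, `self_train`, and the `min_margin` threshold are hypothetical and chosen only for this example.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Return {label: mean of the points carrying that label}."""
    return {c: mean(x for x, y in zip(points, labels) if y == c)
            for c in set(labels)}


def predict(centroids, x):
    """Return (label, margin) for x.

    The margin is the gap between the distances to the two nearest
    centroids; a larger margin means a more confident prediction.
    Assumes at least two classes.
    """
    dists = sorted((abs(x - m), c) for c, m in centroids.items())
    return dists[0][1], dists[1][0] - dists[0][0]


def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, max_rounds=10):
    """Grow the labeled set with confidently pseudo-labeled points."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict(centroids, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from; stop early
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = remaining
    # Return the final model and the points that were never pseudo-labeled.
    return fit_centroids(xs, ys), pool
```

Called as `self_train([0.0, 10.0], ["a", "b"], [1.0, 2.0, 9.0, 8.0, 5.2])`, the sketch pseudo-labels the four points that lie clearly on one side and leaves the ambiguous point `5.2` unlabeled. The confidence threshold is the key design choice: set too low, early mistakes are fed back into training and reinforced.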

This chapter introduces learners to the field of semi-supervised learning, including self-training, generative models, and co-training, along with multiview learning algorithms, graph-based algorithms, and more. The discussion of generative models comprises image classification, text ...
