Chapter 22. Neural Networks for Unstructured Data
22.0 Introduction
In the previous chapter, we focused on neural network recipes for structured (i.e., tabular) data. Most of the largest advances in the past few years, however, have involved using neural networks and deep learning for unstructured data, such as text or images. Working with these unstructured datasets is a bit different from working with structured sources of data.
Deep learning is particularly powerful in the unstructured data space, where “classic” machine learning techniques (such as boosted trees) typically fail to capture all the complexity and nuance present in text data, audio, images, videos, etc. In this chapter, we will explore using deep learning specifically for text and image data.
Within supervised learning for text and images, there are many subtasks or “types” of learning. The following are a few examples (though this is not a comprehensive list):
- Text or image classification (example: classifying whether or not an image is a picture of a hotdog)
- Transfer learning (example: using a pretrained contextual model like BERT and fine-tuning it on a task to predict whether or not an email is spam)
- Object detection (example: identifying and classifying specific objects within an image)
- Generative models (example: models that generate text based on a given input, such as the GPT models)
As deep learning has grown in popularity and become increasingly commoditized, both the open source and ...