Keras to Kubernetes

by Dattaraj Rao
May 2019
Content level: Intermediate to advanced
320 pages
8h 51m
English
Wiley
CHAPTER 3: Handling Unstructured Data

In this chapter, we look in more detail at the differences between structured and unstructured data. The type of data often drives the selection of certain classes of ML algorithms. We see what makes unstructured data different and why it needs particular care. We explore common types of unstructured data such as images, videos, and text, along with the techniques and tools available to analyze this data and extract knowledge from it. We also see examples of converting unstructured data into features that can be used for training Machine Learning models.

Structured vs. Unstructured Data

As we saw in the previous chapter, the key to ML is providing good, clean data from which a model can learn patterns and then make its own predictions on unseen data. Structured data is data in a form that a model can consume easily. It arrives in a fixed structure that does not change over time or across data points, so you can map your features directly onto that structure. Each data point can be thought of as a fixed-size vector, with each dimension (or row) of the vector representing a feature.
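To make the idea of a fixed-size feature vector concrete, here is a minimal sketch in plain Python. The `SensorReading` class, its field names, and the sample values are hypothetical and chosen only for illustration; they do not come from the book. The point it shows is that every data point shares the same structure, so each one flattens to a vector of the same length.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class SensorReading:
    """One structured data point. Field names are hypothetical."""
    temperature_c: float
    pressure_kpa: float
    vibration_mm_s: float


def to_feature_vector(reading):
    """Flatten a reading into a list, one entry per feature, in field order."""
    return list(astuple(reading))


# Illustrative values only: three data points with the same structure.
readings = [
    SensorReading(71.2, 101.3, 0.42),
    SensorReading(71.9, 101.1, 0.45),
    SensorReading(73.4, 100.8, 0.51),
]

# Each row is one data point; each column is one feature.
feature_matrix = [to_feature_vector(r) for r in readings]

# Because the structure is fixed, every vector has the same length.
vector_lengths = {len(v) for v in feature_matrix}
print(vector_lengths)  # prints {3}
```

Because the structure never changes, the position of a value in the vector is enough to identify which feature it represents, which is what lets a model consume the rows directly.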

Figure 3.1 shows two examples of structured data. The first is time-series data obtained as sensor readings. Here you get data points with the same vector structure at different points in time. ...

Publisher Resources

ISBN: 9781119564836