Chapter 3. Getting Started with Microsoft Azure Machine Learning and Automated ML

In Chapter 2, we explained the concept of Automated Machine Learning and provided a brief overview of the automated ML tool on Microsoft Azure Machine Learning. In this chapter, we look at how to get started with Azure Machine Learning and, subsequently, automated ML. Before going into details of automated ML, we’ll first discuss some of the common challenges that enterprises face in their machine learning projects, to better understand these issues.

The Machine Learning Process

When solving problems with machine learning, we begin with a problem statement in terms of what we are trying to optimize. Next, we look for a dataset that will help us solve the problem. We begin looking at the data and use a data manipulation library like Pandas. We look at missing values, distribution of data, and errors in the data. We try to join multiple datasets. When we think we have a good enough dataset to get underway, we split it into train, test, and validation datasets, typically in a ratio of 70:20:10. This helps avoid overfitting, which basically means we’re not using the same dataset for training and testing. We use the train dataset to train the machine learning algorithm. The test dataset is used for testing the machine learning model after training is complete, to ascertain how well the algorithm performed.

We establish a metric to determine algorithm performance and keep iterating until we get a good ...

Get Practical Automated Machine Learning on Azure now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.