Chapter 8. Automated ML for Developers

Earlier, you learned how to use the automated ML tool in Azure Machine Learning with Jupyter Notebooks. In this chapter, you’ll learn how to use automated ML in other environments: Azure Databricks, ML.NET, and SQL Server.

Azure Databricks and Apache Spark

Azure Databricks is a fast, easy, and collaborative Apache Spark–based analytics platform. It is a managed Spark service in Azure and integrates with various Azure services. This means that Azure manages not only the Spark cluster nodes, but also the Spark application running on top of them. It has other helpful features, as follows:

  • Azure Databricks, with its goal of improving productivity for users, is designed to be scalable, secure, and easy to manage. It has a collaborative workspace, shared among users who have appropriate permissions. Users can share multiple notebooks, clusters, and libraries from within the workspace.

  • The Azure Databricks workspace is a single place where data engineers, data scientists, and business analysts can work with all of the required libraries. The data sources can be available in the same workspace as well.

  • In an Azure Databricks workspace, authentication and authorization are based on a user’s Azure Active Directory (Azure AD) login. From a governance perspective, it is easy to add or remove a user from the Azure Databricks workspace, and users can be given different permissions, as a reader, contributor, or owner. And it’s important ...
