© Leila Etaati 2019
Leila EtaatiMachine Learning with Microsoft Technologieshttps://doi.org/10.1007/978-1-4842-3658-1_15

15. Machine Learning on HDInsight

Leila Etaati1 
Aukland, Auckland, New Zealand

In this chapter, an overview of how to use HDInsight for the purpose of machine learning will be presented. HDInsight is based on Apache Spark and used for in-memory cluster processing. Processing data in-memory is much faster than disk-based computing. Spark also supports the Scala language, which supports distributed data sets. Creating a cluster in Spark is very fast, and it is able to use Jupyter Notebook, which makes data processing and visualization easier. Spark clusters can also be integrated with Azure Event Hub and Kafka. Moreover, it is ...

Get Machine Learning with Microsoft Technologies: Selecting the Right Architecture and Tools for Your Project now with the O’Reilly learning platform.

O’Reilly members experience live online training, plus books, videos, and digital content from nearly 200 publishers.