Chapter 7. Getting smart with MLlib

This chapter covers

  • Machine-learning basics
  • Performing linear algebra in Spark
  • Scaling and normalizing features
  • Training and applying a linear regression model
  • Evaluating the model’s performance
  • Using regularization
  • Optimizing linear regression

Machine learning is a scientific discipline that studies the use and development of algorithms that enable computers to accomplish complicated tasks without being explicitly programmed to do so. That is, the algorithms learn over time how to solve a given task. These algorithms draw on methods and techniques from statistics, probability, and information theory.

Today, machine learning is ubiquitous. Examples include online stores that offer you similar items that ...

