Chapter 7. Getting smart with MLlib

This chapter covers

  • Machine-learning basics
  • Performing linear algebra in Spark
  • Scaling and normalizing features
  • Training and applying a linear regression model
  • Evaluating the model’s performance
  • Using regularization
  • Optimizing linear regression

Machine learning is a scientific discipline that studies the use and development of algorithms that let computers accomplish complicated tasks without being explicitly programmed to do so. That is, the algorithms eventually learn how to solve a given task. These algorithms draw on methods and techniques from statistics, probability, and information theory.

Today, machine learning is ubiquitous. Examples include online stores that offer you similar items that ...

