Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Hypothesis

Here, x denotes the input variables, also called input features, and y denotes the output or target variable that we are trying to predict. A pair (x, y) is called a training example, and the dataset used for learning is a list of m training examples {(x, y)}, called the training set. We will also use X to denote the space of input values and Y to denote the space of output values. Given a training set, the goal is to learn a function h: X → Y so that h(x) is a good predictor for the corresponding value of y. The function h is called a hypothesis.

When the target variable to be predicted is continuous, we call the learning problem a regression problem. When y can take a small number of discrete values, we call it a classification problem.
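
As a rough illustration of these definitions, the following Python sketch builds a tiny training set and two hand-written hypotheses, one for regression and one for classification. The data values, the weights, the 4.5 threshold, and the names `training_set`, `h_regression`, and `h_classification` are all assumptions made up for this example; nothing here is learned from data or taken from the book's code.

```python
from typing import List, Tuple

# A training example is a pair (x, y): x is a list of feature values, y is the target.
Example = Tuple[List[float], float]

# Hypothetical training set with m = 3 examples (made-up numbers).
training_set: List[Example] = [
    ([1.0, 2.0], 5.0),
    ([2.0, 0.0], 4.0),
    ([0.0, 3.0], 6.0),
]


def h_regression(x: List[float]) -> float:
    """A hand-picked linear hypothesis h(x) = 2.0*x[0] + 1.5*x[1].

    The weights are illustrative assumptions, not learned parameters.
    """
    return 2.0 * x[0] + 1.5 * x[1]


def h_classification(x: List[float]) -> int:
    """A hypothesis with a discrete output: 1 if the linear score is at least 4.5, else 0."""
    return 1 if h_regression(x) >= 4.5 else 0


m = len(training_set)  # number of training examples

# Apply each hypothesis to the inputs x of the training set.
predictions = [h_regression(x) for x, _ in training_set]
labels = [h_classification(x) for x, _ in training_set]

# Compare continuous predictions h(x) with the targets y.
for (x, y), y_hat in zip(training_set, predictions):
    print(f"x={x}  y={y}  h(x)={y_hat}")
```

Both functions map an input x in X to a value in Y; the only difference is whether Y is continuous (regression) or a small set of discrete values (classification). A learning algorithm would choose the weights from the training set rather than fixing them by hand as done here.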

Let's say we choose to approximate ...
