*x* denotes the input variables, also called input features, and *y* denotes the output or target variable that we are trying to predict. A pair *(x, y)* is called a training example, and the dataset used for learning, a list of *m* training examples *{(x, y)}*, is called a training set. We will also use *X* to denote the space of input values and *Y* to denote the space of output values. Given a training set, the goal is to learn a function *h: X → Y* so that *h(x)* is a good predictor for the corresponding value of *y*. The function *h* is called a **hypothesis**.
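
To make the notation concrete, here is a minimal Python sketch. The data values, the names `training_set` and `h`, and the slope 0.25 are all hypothetical, chosen only for illustration; the hypothesis is fixed by hand rather than learned.

```python
from typing import List, Tuple

# One training example is a pair (x, y).
TrainingExample = Tuple[float, float]

# Hypothetical training set of m = 3 examples (made-up numbers).
training_set: List[TrainingExample] = [
    (1000.0, 250.0),
    (1600.0, 400.0),
    (2400.0, 600.0),
]
m = len(training_set)


def h(x: float) -> float:
    """A hand-picked hypothesis h: X -> Y (assumed, not learned from data)."""
    return 0.25 * x


# Apply the hypothesis to every input in the training set.
predictions = [h(x) for x, _ in training_set]
```

Here both *X* and *Y* are the real numbers. A learning algorithm would choose *h* from the training set instead of having it written down in advance.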

When the target variable to be predicted is continuous, we call the learning problem a regression problem. When *y* can take a small number of discrete values, we call it a classification problem.
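
The difference shows up in the type of the value a hypothesis returns. The sketch below is illustrative only: the datasets, the label names, and the threshold are assumptions, not taken from the text.

```python
from typing import List, Tuple

# Regression: the target y is continuous (hypothetical numbers).
regression_set: List[Tuple[float, float]] = [
    (1000.0, 250.0),
    (1600.0, 400.0),
]

# Classification: the target y comes from a small discrete set of labels
# (hypothetical labels).
classification_set: List[Tuple[float, str]] = [
    (1000.0, "apartment"),
    (2400.0, "house"),
]


def h_regression(x: float) -> float:
    """Hand-picked hypothesis that returns a real number."""
    return 0.25 * x


def h_classification(x: float) -> str:
    """Hand-picked hypothesis that returns one of two labels (assumed threshold)."""
    return "house" if x >= 2000.0 else "apartment"


# The set of labels seen in the classification data is small and discrete.
labels = {y for _, y in classification_set}
```

In both cases the hypothesis maps *X* to *Y*; only the output space *Y* changes.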

Let's say we choose to approximate ...