Doing classification using logistic regression

In classification, the response variable y takes discrete values rather than continuous ones. Examples include e-mail (spam/non-spam) and transactions (safe/fraudulent).

The y variable can take one of two values, 0 or 1, as illustrated in the following equation:

y ∈ {0, 1}

Here, 0 is referred to as the negative class and 1 as the positive class. Though we call them positive and negative, this is only for convenience's sake. Algorithms are neutral about this assignment. Algorithms have no emotions, and 1 does not mean higher than or better than 0.
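To make the arbitrariness of the assignment concrete, here is a minimal sketch in plain Python. The sample labels and the `encode` helper are hypothetical and used only for illustration; they are not from the book. It encodes the same data twice, once with spam as the positive class and once with non-spam as the positive class.

```python
# Hypothetical sample labels, made up for illustration only.
emails = ["spam", "non-spam", "spam", "non-spam", "non-spam"]

def encode(labels, positive):
    """Map each label to 1 if it equals the chosen positive class, else 0."""
    return [1 if label == positive else 0 for label in labels]

# Either class can be called "positive"; the two encodings mirror each other.
y_spam_positive = encode(emails, positive="spam")
y_ham_positive = encode(emails, positive="non-spam")

print(y_spam_positive)  # [1, 0, 1, 0, 0]
print(y_ham_positive)   # [0, 1, 0, 1, 1]
```

Both encodings carry the same information. Which class gets the value 1 is a naming choice, not a ranking.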

Though linear regression works well with regression ...
