In this chapter, we will build a simple logistic regression model in both scikit-learn and PySpark. We will also walk through k-fold cross-validation and use it to tune a hyperparameter in scikit-learn.
Introduction
In the previous chapter, you loaded the credit card data set and analyzed the distribution of its data. You also looked at the relationships between the features and got a general sense of how strongly each one influences the labels.
Now that you’ve gained a better understanding of the data set, you will proceed ...