This section explains how we use evaluation metrics to determine the accuracy of our model.
- A confusion matrix quickly summarizes how the predicted results compare with the actual results. Since we had a 75:25 split, we should see 25 predictions from our test dataset. We can build a confusion matrix using the following script: predictionDF.crosstab('label', 'prediction').show(). The output of the script can be seen in the following screenshot:
- We can now evaluate the accuracy of the model by comparing the predicted values against the actual label values. sklearn.metrics ...
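To make explicit what the crosstab and an accuracy score compute, here is a minimal, dependency-free sketch in plain Python. It is not the Spark or sklearn code from this section: the labels and predictions are made-up data for 25 held-out rows, and the helper names confusion_counts and accuracy are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical labels and predictions for 25 held-out rows (made-up data).
labels = [1] * 13 + [0] * 12
predictions = [1] * 11 + [0] * 2 + [0] * 10 + [1] * 2


def confusion_counts(actual, predicted):
    """Count each (actual, predicted) pair, as a crosstab would."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of rows where the prediction equals the label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


counts = confusion_counts(labels, predictions)
classes = sorted(set(labels) | set(predictions))

# Rows are actual labels, columns are predicted values.
print("label_prediction", *classes)
for a in classes:
    print(a, *(counts[(a, p)] for p in classes))

print("accuracy:", accuracy(labels, predictions))
```

With this made-up data, 21 of the 25 predictions match their labels (the diagonal of the matrix), so the accuracy is 0.84. The off-diagonal cells show where the model confused one class for the other.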