November 2019
Intermediate to advanced
346 pages
9h 36m
English
The first step involves simply loading the necessary libraries that will allow us to manipulate data quickly and easily. In steps 2 and 3, we generate a training and testing set consisting of normal observations. These have the same distributions. In step 4, on the other hand, we generate the remainder of our testing set by creating outliers. This anomalous dataset has a different distribution from the training data and the rest of the testing data. Plotting our data, we see that some outlier points look indistinguishable from normal points (step 5). This guarantees that our classifier will have a significant percentage of misclassifications, due to the nature of the data, and we must keep this in mind when evaluating its ...