Binary classification refers to problems with only two distinct classes. As we did in the previous chapter, we will generate a dataset using the convenience function make_classification() from the scikit-learn library:
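from sklearn import datasets as skds  # skds is assumed to be an alias for sklearn.datasets
X, y = skds.make_classification(n_samples=200, n_features=2, n_informative=2, n_redundant=0, n_repeated=0, n_classes=2, n_clusters_per_class=1)
# reshape the 1-D label vector into a column vector of shape (n_samples, 1)
if (y.ndim == 1):
    y = y.reshape(-1,1)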
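To see what this call gives us, we can inspect the shapes of the returned arrays. The following is only a minimal sketch: the random_state argument and its value are our own assumption, added so that the run is repeatable. The original call does not fix a seed.

```python
import numpy as np
from sklearn import datasets as skds

# random_state=123 is an assumption added for reproducibility;
# the call in the text does not set a seed.
X, y = skds.make_classification(n_samples=200, n_features=2, n_informative=2,
                                n_redundant=0, n_repeated=0, n_classes=2,
                                n_clusters_per_class=1, random_state=123)

# Turn the 1-D label vector into a column vector, as in the text.
if y.ndim == 1:
    y = y.reshape(-1, 1)

print(X.shape)       # features: one row per sample, one column per feature
print(y.shape)       # labels: one row per sample, a single column
print(np.unique(y))  # the distinct class labels present in y
```

With these arguments, X should have shape (200, 2) and y should have shape (200, 1) after the reshape, with the labels taking only the values 0 and 1.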
The arguments to make_classification() are largely self-explanatory: n_samples is the number of data points to generate, n_features is the number of features for each data point, and n_classes is the number of classes, which is 2 for binary classification:
- n_samples is the number of data points to generate. We have set it to 200 to keep the dataset small.