We now implement an S3VM in Python using the SciPy optimization methods, which are mainly based on C and FORTRAN implementations. The reader can try other libraries, such as NLopt and LIBSVM, and compare the results. A possibility suggested by Bennett and Demiriz is to use the L1-norm for w, so as to linearize the objective function; however, this choice seems to produce good results only for small datasets. We are going to keep the original formulation based on the L2-norm, using a Sequential Least Squares Programming (SLSQP) algorithm to optimize the objective.
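To give an idea of how SLSQP handles a constrained L2-norm objective in SciPy, the following is a minimal sketch of a plain supervised soft-margin linear SVM (not the full S3VM) on a tiny toy dataset. The data values, the constant C, and the helper names `objective` and `margin_constraints` are assumptions introduced only for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Toy, linearly separable data (hypothetical values chosen for illustration)
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
n, d = X.shape
C = 1.0  # assumed penalty for the slack variables

# The parameter vector is laid out as theta = [w (d values), b, xi (n values)]
def objective(theta):
    w = theta[:d]
    xi = theta[d + 1:]
    # L2-norm regularization plus the total slack penalty
    return 0.5 * np.dot(w, w) + C * np.sum(xi)

def margin_constraints(theta):
    w, b, xi = theta[:d], theta[d], theta[d + 1:]
    # SLSQP 'ineq' constraints must be >= 0: y_i (w.x_i + b) >= 1 - xi_i
    return y * (X @ w + b) - 1.0 + xi

constraints = [{'type': 'ineq', 'fun': margin_constraints}]
# w and b are unbounded, the slack variables are non-negative
bounds = [(None, None)] * (d + 1) + [(0.0, None)] * n

# Feasible starting point: w = 0, b = 0, xi = 1
theta0 = np.concatenate([np.zeros(d + 1), np.ones(n)])

result = minimize(objective, theta0, method='SLSQP',
                  bounds=bounds, constraints=constraints)

w_opt, b_opt = result.x[:d], result.x[d]
predictions = np.sign(X @ w_opt + b_opt)
```

The S3VM objective extends this scheme with additional slack variables and constraints for the unlabeled samples, but the way the problem is passed to `minimize` (a flat parameter vector, a list of constraint dictionaries, and bounds) stays the same.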
Let's start by creating a bidimensional dataset with both labeled and unlabeled samples:
from sklearn.datasets import make_classification

nb_samples = 500
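One way such a partially labeled dataset might be built is sketched below. The number of unlabeled samples (`nb_unlabeled = 200`), the random seed, and the convention of marking unlabeled points with 0 while the two classes use -1 and +1 are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification

nb_samples = 500
nb_unlabeled = 200  # assumed value, chosen only for illustration

# Two informative features and no redundant ones give a two-dimensional dataset
X, Y = make_classification(n_samples=nb_samples, n_features=2,
                           n_informative=2, n_redundant=0,
                           random_state=1000)

# Map the class labels from {0, 1} to {-1, +1}, as expected by SVM formulations
Y[Y == 0] = -1

# Assumed convention: the last nb_unlabeled samples are marked as unlabeled (0)
Y[nb_samples - nb_unlabeled:] = 0
```

With this convention, the labeled subset can be selected with `X[Y != 0]` and the unlabeled one with `X[Y == 0]`.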