Chapter 43. In Depth: Support Vector Machines
Support vector machines (SVMs) are a particularly powerful and flexible class of supervised algorithms for both classification and regression. In this chapter, we will explore the intuition behind SVMs and their use in classification problems.
We begin with the standard imports:
In [1]: %matplotlib inline
        import numpy as np
        import matplotlib.pyplot as plt
        plt.style.use('seaborn-whitegrid')
        from scipy import stats
Note
Full-size, full-color figures are available in the supplemental materials on GitHub.
Motivating Support Vector Machines
As part of our discussion of Bayesian classification (see Chapter 41), we learned about a simple kind of model that describes the distribution of each underlying class, and experimented with using it to probabilistically determine labels for new points. That was an example of generative classification; here we will consider instead discriminative classification. That is, rather than modeling each class, we will simply find a line or curve (in two dimensions) or manifold (in multiple dimensions) that divides the classes from each other.
As an example of this, consider the simple case of a classification task in which the two classes of points are well separated (see Figure 43-1).
In [2]: from sklearn.datasets import make_blobs
        X, y = make_blobs(n_samples=50, centers=2,
                          random_state=0, cluster_std=0.60)
        plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn');
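To make the generative/discriminative distinction concrete, the following minimal sketch fits one model of each kind to the same blobs. The choice of `GaussianNB` as the generative model, a linear-kernel `SVC` as the discriminative one, and the very large `C` value are illustrative assumptions, not something the text above prescribes.

```python
from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Same two well-separated blobs as in the scatter plot
X, y = make_blobs(n_samples=50, centers=2,
                  random_state=0, cluster_std=0.60)

# Generative: learns a description of each class
# (here, one Gaussian per class; theta_ holds the per-class feature means)
generative = GaussianNB().fit(X, y)
print("per-class means:\n", generative.theta_)

# Discriminative: learns only the boundary between the classes,
# a line w . x + b = 0 in two dimensions
# (C=1E10 is an assumed value that approximates a hard margin)
discriminative = SVC(kernel='linear', C=1E10).fit(X, y)
print("w =", discriminative.coef_[0], " b =", discriminative.intercept_[0])
```

The generative model stores parameters for each class separately, while the discriminative model stores a single weight vector and offset that describe the dividing line and nothing about the shape of either class.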