Implementing SVM

We have largely covered the fundamentals of the SVM classifier. Now, let's apply it right away to newsgroup topic classification. We start with a binary case that classifies two topics, comp.graphics and sci.space.

Let's take a look at the following steps:

  1. First, we load the training and testing subsets of the comp.graphics and sci.space newsgroup data:
>>> from sklearn.datasets import fetch_20newsgroups
>>> categories = ['comp.graphics', 'sci.space']
>>> data_train = fetch_20newsgroups(subset='train',
...                          categories=categories, random_state=42)
>>> data_test = fetch_20newsgroups(subset='test',
...                          categories=categories, random_state=42)
Don't forget to specify a random state in order to reproduce experiments.
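
To see why a fixed random state matters, here is a minimal sketch using only Python's standard library. In fetch_20newsgroups, random_state controls how the documents are shuffled; the helper shuffled_copy and the post_N placeholders below are hypothetical stand-ins for illustration, and the sketch uses Python's random module rather than the NumPy generator that scikit-learn relies on.

```python
import random


def shuffled_copy(items, random_state=None):
    """Return a shuffled copy of items.

    Hypothetical helper: a fixed random_state yields the same order on
    every call, while None seeds the generator from system entropy.
    """
    rng = random.Random(random_state)  # private generator, global state untouched
    result = list(items)
    rng.shuffle(result)
    return result


# Placeholder "documents" standing in for newsgroup posts (not real data)
documents = [f"post_{i}" for i in range(10)]

first_run = shuffled_copy(documents, random_state=42)
second_run = shuffled_copy(documents, random_state=42)
unseeded_run = shuffled_copy(documents)  # order may differ from run to run

# The same seed reproduces the same order
print(first_run == second_run)
# Shuffling only reorders the items; nothing is added or lost
print(sorted(first_run) == sorted(documents))
```

With the seed fixed, anyone rerunning the experiment works with the samples in the same order, which keeps later results comparable across runs.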
