We have now covered the fundamentals of the SVM classifier, so let's apply it to newsgroup topic classification. We start with a binary case that distinguishes between two topics, comp.graphics and sci.space.
The steps are as follows:
- First, we load the training and testing subsets of the newsgroup data for the two categories, comp.graphics and sci.space:
>>> from sklearn.datasets import fetch_20newsgroups
>>> categories = ['comp.graphics', 'sci.space']
>>> data_train = fetch_20newsgroups(subset='train', categories=categories, random_state=42)
>>> data_test = fetch_20newsgroups(subset='test', categories=categories, random_state=42)
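Before training a binary classifier, it can help to check how many samples each of the two categories contains. The following is a minimal sketch of such a check, not part of the original steps. The helper `label_distribution` is a hypothetical name introduced here for illustration, and the small label list is made-up stand-in data so that the snippet runs without downloading the dataset.

```python
from collections import Counter


def label_distribution(targets, target_names):
    """Map each category name to the number of samples carrying its label.

    `targets` is a sequence of integer labels and `target_names` lists the
    category name for each label index (hypothetical helper for illustration).
    """
    counts = Counter(int(t) for t in targets)
    return {name: counts.get(i, 0) for i, name in enumerate(target_names)}


# Made-up labels standing in for the real training labels
example_targets = [0, 1, 1, 0, 1]
example_names = ['comp.graphics', 'sci.space']
print(label_distribution(example_targets, example_names))
```

With the loaded data, the same helper could presumably be called as `label_distribution(data_train.target, data_train.target_names)`, since the object returned by `fetch_20newsgroups` exposes the integer labels and category names under those attributes.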