Python Machine Learning By Example - Second Edition by Yuxi Liu

Classifying newsgroup topics with SVMs

Finally, it is time to build our state-of-the-art SVM-based newsgroup topic classifier using everything we just learned.

First, we load and clean the dataset with all 20 groups (setting categories to None selects every group) as follows:

>>> from sklearn.datasets import fetch_20newsgroups
>>> # clean_text and tfidf_vectorizer are assumed to be defined earlier in the chapter
>>> categories = None
>>> data_train = fetch_20newsgroups(subset='train',
...                          categories=categories, random_state=42)
>>> data_test = fetch_20newsgroups(subset='test',
...                          categories=categories, random_state=42)
>>> cleaned_train = clean_text(data_train.data)
>>> label_train = data_train.target
>>> cleaned_test = clean_text(data_test.data)
>>> label_test = data_test.target
>>> term_docs_train = tfidf_vectorizer.fit_transform(cleaned_train)
>>> term_docs_test = tfidf_vectorizer.transform(cleaned_test)
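The listing above fits the vectorizer on the training documents only and then reuses that fitted vocabulary for the test documents. The following is a minimal standard-library sketch of that pattern, not the book's implementation. The clean_text shown here is a hypothetical stand-in (lowercasing and keeping alphabetic tokens only), fit_vocabulary and transform are illustrative names, and plain term counts are used in place of TF-IDF weights to keep the example short.

```python
def clean_text(docs):
    """Hypothetical stand-in for the chapter's clean_text helper:
    lowercase each document and keep only purely alphabetic tokens."""
    cleaned = []
    for doc in docs:
        tokens = [w for w in doc.lower().split() if w.isalpha()]
        cleaned.append(' '.join(tokens))
    return cleaned


def fit_vocabulary(docs):
    """Learn a term -> column index mapping from the training documents only."""
    terms = sorted({w for doc in docs for w in doc.split()})
    return {term: i for i, term in enumerate(terms)}


def transform(docs, vocabulary):
    """Count terms per document against a fixed vocabulary.
    Terms that were not seen during fitting are ignored."""
    rows = []
    for doc in docs:
        row = [0] * len(vocabulary)
        for w in doc.split():
            if w in vocabulary:
                row[vocabulary[w]] += 1
        rows.append(row)
    return rows


# Toy documents (made up for illustration, not taken from 20 newsgroups)
train = clean_text(["Graphics cards render 3D scenes",
                    "Hockey teams play hockey"])
test = clean_text(["New graphics drivers", "Baseball and hockey"])

vocabulary = fit_vocabulary(train)           # analogous to fit on train
counts_train = transform(train, vocabulary)
counts_test = transform(test, vocabulary)    # analogous to transform on test
```

Because the vocabulary comes from the training set alone, the test rows have the same number of columns as the training rows, and words that appear only in the test set (such as "baseball" here) contribute nothing. This is the reason the listing calls fit_transform on the training data but only transform on the test data.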

As we have seen that the ...
