3. Developing a Text Classifier

Overview

This chapter begins by introducing the types of machine learning methods, namely supervised and unsupervised. You will learn about hierarchical clustering and k-means clustering and apply them to several datasets. Next, you will explore tree-based methods such as random forest and XGBoost. Finally, you will implement an end-to-end text classifier that categorizes text based on its content.

Introduction

In the previous chapters, you learned about various extraction methods, such as tokenization, stemming, lemmatization, and stop-word removal, which are used to extract features from unstructured text. We also discussed Bag of Words and Term Frequency-Inverse ...
