The Natural Language Processing Workshop

by Rohan Chopra, Aniruddha M. Godbole, Nipun Sadvilkar, Muzaffar Bashir Shah, Sohom Ghosh, Dwight Gunning, Ankit Bhatia, Nagendra Nagaraj, John Bura, Sumit Kumar Raj, Tom Taulli, Ankit Verma
August 2020
Content level: Beginner to intermediate
452 pages
7h 42m
English
Packt Publishing

3. Developing a Text Classifier

Overview

This chapter starts with an introduction to the two main types of machine learning methods: supervised and unsupervised. You will learn about hierarchical clustering and k-means clustering and implement them using various datasets. Next, you will explore tree-based methods such as random forest and XGBoost. Finally, you will implement an end-to-end text classifier that categorizes text on the basis of its content.
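To make the idea of categorizing text by its content concrete, here is a minimal sketch of a supervised text classifier written with only the Python standard library. It is an illustration, not the chapter's own implementation: it swaps in a simple nearest-profile classifier over word counts in place of the random forest and XGBoost models named above, and the training sentences, the labels, and the helper names (`tokenize`, `bag_of_words`, `cosine`, `train`, `predict`) are all assumptions made up for this example.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Represent a text as a mapping from token to occurrence count."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(labelled_texts):
    """Sum the word counts of every class into one profile per label."""
    profiles = {}
    for text, label in labelled_texts:
        profiles.setdefault(label, Counter()).update(bag_of_words(text))
    return profiles


def predict(profiles, text):
    """Return the label whose profile is most similar to the text."""
    vector = bag_of_words(text)
    return max(profiles, key=lambda label: cosine(vector, profiles[label]))


# Hypothetical toy data, invented for this illustration only.
TRAINING_DATA = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the new phone has a faster processor", "tech"),
    ("the laptop ships with more memory", "tech"),
]

profiles = train(TRAINING_DATA)
print(predict(profiles, "a goal in the final match"))
print(predict(profiles, "a phone with a fast processor"))
```

The sketch skips stop-word removal and term weighting, so common words such as "the" still influence the similarity score. A real pipeline would normally add those steps and evaluate the model on held-out data.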

Introduction

In the previous chapters, you learned about various extraction methods, such as tokenization, stemming, lemmatization, and stop-word removal, which are used to extract features from unstructured text. We also discussed Bag of Words and Term Frequency-Inverse ...



Publisher Resources

ISBN: 9781800208421