
Python Natural Language Processing Cookbook

Name: Python Natural Language Processing Cookbook
Author: Zhenya Antić
ISBN: 9781838987312


March 2021

Beginner to intermediate

284 pages

English

Packt Publishing


Python Natural Language Processing Cookbook
Contributors
About the author
About the reviewers
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Get in touch
Reviews
Chapter 1: Learning NLP Basics
Technical requirements
Dividing text into sentences (Getting ready, How to do it…, How it works…, There's more…, See also)
Dividing sentences into words – tokenization (Getting ready, How to do it…, How it works…, There's more…, See also)
Parts of speech tagging (Getting ready, How to do it…, How it works…, There's more…, See also)
Word stemming (Getting ready, How to do it…, How it works…, There's more…, See also)
Combining similar words – lemmatization (Getting ready, How to do it…, How it works…, There's more…)
Removing stopwords (Getting ready…, How to do it…, How it works…, There's more…)
Chapter 2: Playing with Grammar
Technical requirements
Counting nouns – plural and singular nouns (Getting ready, How to do it…, How it works…, There's more…)
Getting the dependency parse (Getting ready, How to do it…, How it works…, See also)
Splitting sentences into clauses (Getting ready, How to do it…, How it works…)
Extracting noun chunks (Getting ready, How to do it…, How it works…, There's more…, See also)
Extracting entities and relations (Getting ready, How to do it…, How it works…, There's more…)
Extracting subjects and objects of the sentence (Getting ready, How to do it…, How it works…, There's more…)
Finding references – anaphora resolution (Getting ready, How to do it…, How it works…, There's more…)
Chapter 3: Representing Text – Capturing Semantics
Technical requirements
Putting documents into a bag of words (Getting ready, How to do it…, How it works…, There's more…)
Constructing the N-gram model (Getting ready, How to do it…, How it works…, There's more…)
Representing texts with TF-IDF (Getting ready, How to do it…, How it works…, There's more…)
Using word embeddings (Getting ready, How to do it…, How it works…, There's more…, See also)
Training your own embeddings model (Getting ready, How to do it…, How it works…, There's more…, See also)
Representing phrases – phrase2vec (Getting ready, How to do it…, How it works…, See also)
Using BERT instead of word embeddings (Getting ready, How to do it…, How it works…)
Getting started with semantic search (Getting ready, How to do it…, How it works…, See also)
Chapter 4: Classifying Texts
Technical requirements
Getting the dataset and evaluation baseline ready (Getting ready, How to do it…, How it works…)
Performing rule-based text classification using keywords (Getting ready, How to do it…, How it works…, There's more…)
Clustering sentences using K-means – unsupervised text classification (Getting ready, How to do it…, How it works…)
Using SVMs for supervised text classification (Getting ready, How to do it…, How it works…, There's more…)
Using LSTMs for supervised text classification (Getting ready, How to do it…, How it works…)
Chapter 5: Getting Started with Information Extraction
Technical requirements
Using regular expressions (Getting ready, How to do it…, How it works…, There's more…)
Finding similar strings: the Levenshtein distance (Getting ready, How to do it…, How it works…, There's more…, See also)
Performing named entity recognition using spaCy (Getting ready, How to do it…, How it works…, There's more…)
Training your own NER model with spaCy (Getting ready, How to do it…, How it works…, There's more…, See also)
Discovering sentiment analysis (Getting ready, How to do it…, How it works…)
Sentiment for short texts using LSTM: Twitter (Getting ready, How to do it…, How it works…)
Using BERT for sentiment analysis (Getting ready, How to do it…, How it works…, There's more…, See also)
Chapter 6: Topic Modeling
Technical requirements
LDA topic modeling with sklearn (Getting ready, How to do it…, How it works…, There's more…)
LDA topic modeling with gensim (Getting ready, How to do it…, How it works…, There's more…)
NMF topic modeling (Getting ready, How to do it…, How it works…)
K-means topic modeling with BERT (Getting ready, How to do it…, How it works…)
Topic modeling of short texts (Getting ready, How to do it…, How it works…, See also)
Chapter 7: Building Chatbots
Technical requirements
Building a basic chatbot with keyword matching (Getting ready, How to do it…, How it works…, There's more…)
Building a basic Rasa chatbot (Getting ready, How to do it…, How it works…, There's more…, See also)
Creating question-answer pairs with Rasa (Getting ready, How to do it…, How it works…)
Creating and visualizing conversation paths with Rasa (Getting ready, How to do it…, How it works…)
Creating actions for the Rasa chatbot (Getting ready, How to do it…, How it works…, See also)
Chapter 8: Visualizing Text Data
Technical requirements
Visualizing the dependency parse (Getting ready, How to do it…, How it works…)
Visualizing parts of speech (Getting ready, How to do it…, How it works…)
Visualizing NER (Getting ready, How to do it…, How it works…)
Constructing word clouds (Getting ready, How to do it…, How it works…, There's more…, See also)
Visualizing topics (Getting ready, How to do it…, How it works…, See also)

Why subscribe?
Other Books You May Enjoy
Packt is searching for authors like you
Leave a review - let other readers know what you think

Content preview from Python Natural Language Processing Cookbook

Chapter 5: Getting Started with Information Extraction

In this chapter, we will cover the basics of information extraction. We will start by extracting email addresses and URLs from job announcements. Then we will use a string metric called the Levenshtein distance to find similar strings. Next, we will use spaCy to find named entities in text, and then train our own named entity recognition (NER) model in spaCy. We will then do basic sentiment analysis and, finally, train two custom sentiment analysis models.

You will learn how to use existing tools and train your own models for information extraction tasks.
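To make the first task concrete, here is a minimal sketch of pulling email addresses and URLs out of a job announcement with Python's standard `re` module. The patterns, the `extract_contacts` helper, and the sample text are assumptions made for illustration; they are deliberately simple and are not taken from the book's recipe.

```python
import re

# Simplified patterns (assumptions for illustration): they cover common
# cases only and are not a full implementation of the email or URL standards.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s<>\"')]+")


def extract_contacts(text):
    """Return (emails, urls) found in text, in order of appearance."""
    emails = EMAIL_RE.findall(text)
    # Drop sentence punctuation that the URL pattern may have swallowed.
    urls = [url.rstrip(".,;") for url in URL_RE.findall(text)]
    return emails, urls


# Hypothetical job announcement text.
sample = "Send your CV to jobs@example.com or apply at https://example.com/careers."
emails, urls = extract_contacts(sample)
print(emails)
print(urls)
```

Real announcements contain obfuscated addresses and unusual URL forms, so patterns like these usually need tuning against the actual data.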

We will cover the following recipes in this chapter:

Using regular expressions
Finding similar strings: the Levenshtein distance ...
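As a rough illustration of the second recipe's idea, the Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is a plain dynamic-programming version; the function names `levenshtein` and `closest` and the sample words are hypothetical, and the recipe itself may rely on a dedicated library instead.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute cost 1 each)."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def closest(query, candidates):
    """Return the candidate with the smallest edit distance to query."""
    return min(candidates, key=lambda candidate: levenshtein(query, candidate))


print(levenshtein("kitten", "sitting"))
print(closest("pyton", ["python", "spaCy", "pattern"]))
```

Picking the candidate with the smallest distance is one simple way to match a misspelled string against a list of known values.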
