Skip to Content
Natural Language Processing with Spark NLP
book

Natural Language Processing with Spark NLP

by Alex Thomas
June 2020
Beginner to intermediate
364 pages
8h 58m
English
O'Reilly Media, Inc.
Content preview from Natural Language Processing with Spark NLP

Chapter 12. Sentiment Analysis and Emotion Detection

Sentiment analysis is a set of techniques used for quantifying some sentiment based on text content. There are many community sites and e-commerce sites that allow users to comment and rate products and services. However, this is not the only place where people discuss products and services—there is also social media. We can leverage the data from the sites with comments and ratings to learn the relationship between the language used and positive or negative sentiment. These approaches can be extended to predicting the emotions of the author of a piece of text. Sentiment analysis is one of the most popular uses of NLP.

For this application, we are trying to build a program that we can use to quantify movie reviews. Although many, but not all, movie reviewers use some quantifiable metrics—for example, thumbs up/down, stars, or letter grades, these are not normalized. Two reviewers who use a 10-point scale may have different distributions. One reviewer may give most movies a 4–6 range, where another gives a 6–8 range. We could normalize them, but what about the other reviewers who use different metrics or no metrics at all? It might be better if we build a model that looks at the reviews and produces a score. This way, we know that the scores from a given reviewer are based on the text of the review, instead of on an ad-hoc score.

Problem Statement and Constraints

  1. What is the problem we are trying to solve?

    We want to build ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

AirBnbBlueOriginElectronic ArtsHomeDepotNasdaqRakutenTata Consultancy Services

QuotationMarkO’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.
Julian F.
Head of Cybersecurity
QuotationMarkI wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.
Addison B.
Field Engineer
QuotationMarkI’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.
Amir M.
Data Platform Tech Lead
QuotationMarkI'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.
Mark W.
Embedded Software Engineer

You might also like

Natural Language Processing (NLP)

Natural Language Processing (NLP)

Bruno Goncalves

Publisher Resources

ISBN: 9781492047759Errata Page