Chapter 12. Sentiment Analysis and Emotion Detection

Sentiment analysis is a set of techniques for quantifying the sentiment expressed in text. Many community and e-commerce sites allow users to comment on and rate products and services. These are not the only places where people discuss products and services, however; there is also social media. We can use the data from sites that pair comments with ratings to learn the relationship between the language used and positive or negative sentiment. These approaches can be extended to predicting the emotions of the author of a piece of text. Sentiment analysis is one of the most popular uses of NLP.

For this application, we will build a program that quantifies movie reviews. Many movie reviewers, though not all, use some quantifiable metric (for example, thumbs up/down, stars, or letter grades), but these metrics are not normalized. Two reviewers who use a 10-point scale may have different distributions: one may rate most movies in the 4–6 range, while another rates most in the 6–8 range. We could normalize them, but what about the reviewers who use different metrics or no metrics at all? It might be better to build a model that looks at the text of each review and produces a score. This way, we know that the score for a review is based on the text of the review, instead of on the reviewer's ad hoc scale.
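To make the normalization issue concrete, here is a minimal sketch of per-reviewer standardization (z-scores), one common way to put numeric ratings from different reviewers on a comparable scale. The function name `normalize_by_reviewer`, the reviewer names, and the ratings are all hypothetical and exist only for illustration; nothing here comes from a real dataset.

```python
from statistics import mean, pstdev


def normalize_by_reviewer(ratings):
    """Convert each reviewer's raw scores to z-scores.

    `ratings` maps a reviewer name to a list of numeric scores.
    This is an illustrative helper, not part of any library.
    """
    normalized = {}
    for reviewer, scores in ratings.items():
        mu = mean(scores)
        sigma = pstdev(scores)  # population standard deviation
        if sigma == 0:
            # A reviewer who always gives the same score carries no signal.
            normalized[reviewer] = [0.0 for _ in scores]
        else:
            normalized[reviewer] = [(s - mu) / sigma for s in scores]
    return normalized


# Made-up ratings on a 10-point scale: one reviewer clusters around 4-6,
# the other around 6-8.
ratings = {
    "reviewer_a": [4, 5, 6],
    "reviewer_b": [6, 7, 8],
}

normalized = normalize_by_reviewer(ratings)
for reviewer, scores in normalized.items():
    print(reviewer, [round(s, 2) for s in scores])
```

After standardization, the two reviewers' scores line up even though their raw ranges differ. The sketch only works for reviewers who give numeric scores, though, which is the limitation that motivates scoring reviews from their text.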

Problem Statement and Constraints

  1. What is the problem we are trying to solve?

    We want to build ...
