
Apache Spark for Data Science Cookbook by Padma Priya Chitturi


Chapter 6. NLP with Spark

In this chapter, we will see how to run NLP algorithms over Spark. You will learn the following recipes:

  • Installing NLTK on Linux
  • Installing Anaconda on Linux
  • Anaconda for cluster management
  • POS tagging with PySpark on an Anaconda cluster
  • Named Entity Recognition with IPython over Spark
  • Implementing OpenNLP - chunker over Spark
  • Implementing OpenNLP - sentence detector over Spark
  • Implementing Stanford NLP - lemmatization over Spark
  • Implementing sentiment analysis using Stanford NLP over Spark

Introduction

Natural language processing (NLP) is the study of how computers can handle the nuances of human language, together with the building of real-world applications that use those techniques. NLP is analogous to teaching a language to ...
