Get Started with Natural Language Processing Using Python, Spark, and Scala

by O'Reilly Media, Inc.

Released March 2017

Publisher(s): O'Reilly Media, Inc.

ISBN: 9781491985847

Video description

Whether you’re a programmer with little to no knowledge of Python, or an experienced data scientist or engineer, this course walks you through natural language processing in both Python and Scala and shows you how to use a range of popular text-mining tools, including Spark, scikit-learn, spaCy, NLTK, and gensim.

You’ll learn the most common techniques for processing text, how to use machine learning to generate annotators and apply them within a data pipeline, and how NLP pipelines differ from other approaches to semantic text mining. The course covers standard UIMA annotators, custom annotators, and machine-learned annotators, and shows how text processing pipeline architectures can incorporate popular big data tools such as Kafka, Spark, Spark SQL, Cassandra, and Elasticsearch.

By the end of the course, you will be able to build a natural language processing and entity extraction pipeline, and you will understand the capabilities and limitations of natural language text processing.

Materials or downloads needed in advance: Example files
