Video description
The course is designed for engineers and data scientists who have some familiarity with Scala, Apache Spark, and machine learning and who need to process large amounts of natural language text in a distributed fashion.

We will use a sample of posts from the subreddit /r/WritingPrompts, which contains short stories and comments about those stories.

The course has four parts:
1. Building a natural language processing and entity extraction pipeline on Scala & Spark
2. Machine Learning Applications for Statistical Natural Language Understanding at Scale
3. Topic Modeling on Natural Language with Scala, Spark and MLLib
4. Deep Learning Applications for Natural Language Understanding with Scala, Spark and MLLib

You will learn how to use Apache Spark to process text with annotations, apply machine learning to your annotations, create and use topic models, and create and use a word2vec model.
Table of contents
- Welcome to the Course
- Part 1: Building a natural language processing and entity extraction pipeline on Scala & Spark
- Part 2: Machine Learning Applications for Statistical Natural Language Understanding at Scale
- Part 3: Topic Modeling on Natural Language with Scala, Spark and MLLib
- Part 4: Deep Learning Applications for Natural Language Understanding with Scala, Spark and MLLib
Product information
- Title: Building Pipelines for Natural Language Understanding with Spark
- Author(s):
- Release date: December 2016
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491978122