Natural Language Processing (NLP)
on-demand course

with Bruno Goncalves
October 2018
Intermediate
2h 21m
English
Pearson
Closed Captioning available in English, Japanese, Korean, Chinese (Simplified), Chinese (Traditional)

Overview

2+ Hours of Video Instruction

Natural Language Processing LiveLessons covers the fundamentals of natural language processing (NLP). It introduces you to the basic concepts, ideas, and algorithms necessary to develop your own NLP applications in a step-by-step and intuitive fashion. The lessons follow a gradual progression, from the more specific to the more abstract, taking you from the very basics to some of the most recent and sophisticated algorithms.

About the Instructor

Bruno Goncalves is currently a Senior Data Scientist working at the intersection of Data Science and Finance. Previously, he was a Data Science fellow at NYU’s Center for Data Science while on leave from a tenured faculty position at Aix-Marseille Universite. Since completing his PhD in the Physics of Complex Systems in 2008, he has been pursuing the use of Data Science and Machine Learning to study Human Behavior. Using large datasets from Twitter, Wikipedia, web access logs, and Yahoo! Meme, he studied how we can observe both large-scale and individual human behavior in an unobtrusive and widespread manner. The main applications have been to the study of Computational Linguistics, Information Diffusion, Behavioral Change, and Epidemic Spreading. In 2015 he was awarded the Complex Systems Society's Junior Scientific Award for “outstanding contributions in Complex Systems Science,” and in 2018 he was named a Science Fellow of the Institute for Scientific Interchange in Turin, Italy.

Skill Level

  • Intermediate

Learn How To

  • Represent text
  • Model topics
  • Conduct sentiment analysis
  • Understand word2vec word embeddings
  • Define GloVe
  • Apply language detection

Who Should Take This Course

Data scientists with an interest in natural language processing

Course Requirements

  • Basic algebra
  • Calculus and statistics
  • Programming experience

Lesson Descriptions

Lesson 1: Text Representations
The first step in any NLP application is to establish how to represent text as numbers. One-hot encodings provide us with a sparse approach to representing words and n-grams, while bag-of-words improves memory efficiency even further. Naturally, not all words are meaningful, so the next steps are to remove meaningless stop words and to identify the most relevant words for our application using term frequency/inverse document frequency (TF/IDF). Finally, the lesson covers how to identify the stems of words so you can meaningfully reduce the size of your vocabulary.
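
As a rough illustration of the bag-of-words, stop-word, and TF/IDF ideas named in this lesson, here is a minimal sketch in plain Python. The toy corpus, the stop-word list, and the function names are assumptions made for this example and are not taken from the course material; the weighting shown (relative term frequency times the log of inverse document frequency) is one common variant among several.

```python
import math
from collections import Counter

# Toy corpus and stop-word list, invented for this illustration.
DOCS = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the dog barked",
]
STOP_WORDS = {"the", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def bag_of_words(text):
    """Map each remaining word to the number of times it occurs."""
    return Counter(tokenize(text))


def tf_idf(docs):
    """Return one {word: weight} dict per document.

    tf  = count of the word / number of words in the document
    idf = log(number of documents / documents containing the word)
    """
    bags = [bag_of_words(d) for d in docs]
    n_docs = len(bags)
    # Each bag contributes every distinct word once, giving document frequency.
    doc_freq = Counter(word for bag in bags for word in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in bag.items()
        })
    return weights


weights = tf_idf(DOCS)
```

In this toy corpus "sat" appears in only one document while "cat" appears in two, so "sat" ends up with the higher weight in the first document, which is the behavior TF/IDF is meant to capture.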

Lesson 2: Topic Modeling
Lesson 2 builds on the text representations of Lesson 1 to develop ways of identifying the main subject or subjects of a text. Bruno starts by defining topics and how they can be identified. Next, you learn how to perform explicit semantic analysis to find documents mentioning a specific topic and how to cluster documents according to topics. Latent semantic analysis provides yet another powerful way to extract meaning from raw text, while non-negative matrix factorization enables you to identify latent dimensions in the text, perform recommendations, and measure similarities.
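
To make the idea of finding documents that mention a specific topic more concrete, the following is a toy sketch loosely in the spirit of explicit semantic analysis: each topic is described by a short reference text, and a document is scored against each topic by cosine similarity of word counts. The concept descriptions and function names are hypothetical, chosen only for this example; a real system would typically use weighted vectors built from a large reference corpus rather than raw counts over a handful of keywords.

```python
import math
from collections import Counter

# Hypothetical concept descriptions standing in for a real knowledge base.
CONCEPTS = {
    "sports": "team game player score match win league",
    "finance": "market stock price bank investor trade fund",
}


def vectorize(text):
    """Represent a text as raw word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def concept_scores(text):
    """Score a document against every concept description."""
    doc = vectorize(text)
    return {name: cosine(doc, vectorize(desc)) for name, desc in CONCEPTS.items()}


def best_topic(text):
    """Return the name of the highest-scoring concept."""
    scores = concept_scores(text)
    return max(scores, key=scores.get)
```

For example, `best_topic("investor sold stock as the market fell")` picks the finance concept because three of its words overlap with that description and none overlap with the sports one.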

Lesson 3: Sentiment Analysis
After covering how to represent text in a meaningful way and how to identify the topics covered in a document, we now focus on how to extract sentiment information. In other words, what kind of sentiments are being expressed? Are the words used positive or negative? The next step is to consider corpus-based approaches to defining the valence of each word and, finally, how to handle negations and modifiers.
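
A small sketch can show how word valences, negations, and modifiers might combine into a single score. The lexicon, the negation and modifier lists, and the scoring rule below are all assumptions invented for this example, not the approach taught in the lesson; in particular, a pending negation or modifier here simply applies to the next sentiment word, which is a deliberate simplification.

```python
# Hypothetical lexicon: word -> valence (positive or negative strength).
VALENCE = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
# Words that flip the sign of the next sentiment word.
NEGATIONS = {"not", "never", "no"}
# Words that scale the strength of the next sentiment word.
MODIFIERS = {"very": 1.5, "slightly": 0.5}


def sentiment(text):
    """Sum the valences of known words, applying pending negations/modifiers."""
    score = 0.0
    sign = 1.0   # flipped by each negation seen since the last sentiment word
    scale = 1.0  # product of modifiers seen since the last sentiment word
    for raw in text.lower().split():
        word = raw.strip(".,!?")
        if word in NEGATIONS:
            sign = -sign
        elif word in MODIFIERS:
            scale *= MODIFIERS[word]
        elif word in VALENCE:
            score += sign * scale * VALENCE[word]
            # Negations and modifiers are consumed by the word they apply to.
            sign, scale = 1.0, 1.0
    return score
```

With this rule, "not very bad" scores positive because the negation flips the scaled negative valence of "bad", while unknown words such as "movie" contribute nothing.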

Lesson 4: Applications
The first three lessons covered the fundamental tools of NLP, and now you are ready to consider specific applications and advanced topics. Perhaps one of the most important developments in NLP in recent years is the popularization of word embeddings in general and word2vec in particular. This enables you to delve deeper into vector representations of words and concepts, and to understand how semantic relations can be expressed through vector algebra. GloVe is the main competitor to word2vec, and this lesson also explores its advantages and disadvantages. As the final application of NLP and the last section in our course, we consider the question of language detection.
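
The claim that semantic relations can be expressed through vector algebra can be illustrated with a tiny hand-made example. The three-dimensional vectors below are invented for this sketch (their axes loosely stand for "royalty", "male", and "female"); real word2vec or GloVe embeddings are learned from large corpora and have far more dimensions. The `analogy` helper is likewise a hypothetical name, not part of any library.

```python
import math

# Hand-made toy vectors (royalty, male, female); real embeddings are learned.
EMBEDDINGS = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def analogy(a, b, c):
    """Answer "a is to b as c is to ?" via vec(b) - vec(a) + vec(c).

    The three input words are excluded from the candidates.
    """
    target = [
        vb - va + vc
        for va, vb, vc in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    ]
    candidates = [w for w in EMBEDDINGS if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, EMBEDDINGS[w]))
```

Here `analogy("man", "king", "woman")` lands on "queen" because subtracting the "male" direction from "king" and adding the "female" direction produces a vector that points the same way as the toy "queen" vector.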

About Pearson Video Training

Pearson publishes expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. These professional and personal technology videos feature world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include IT Certification, Network Security, Cisco Technology, Programming, Web Development, Mobile Development, and more. Learn more about Pearson Video Training at http://www.informit.com/video.

Publisher Resources

ISBN: 9780135258842