Video description
How do software programs that automatically extract information from web pages actually work? This video course, based on content from the book "Mining the Social Web" (O'Reilly Media) by Matthew Russell, teaches you how to create machines that can navigate the internet, cut through the noise, and extract the most important textual content from any web page or group of web pages. You'll learn how to use Python to write programs that crawl, scrape, and parse the web. You'll also discover how to extract key terms and sentences from web-mined documents, explore document summarization techniques used in natural language processing and artificial intelligence, and gain experience using Python’s Natural Language Toolkit (NLTK) to auto-summarize web articles. Learners should have a basic understanding of Python.
- Understand how helpful (and malicious) web bots crawl, parse, and index the web
- Learn how to scrape content, extract links, and parse information from web pages and blog feeds
- Discover how Python’s Natural Language Toolkit (NLTK) extracts and summarizes content
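To give a feel for the topics listed above, here is a minimal sketch of link extraction and frequency-based sentence scoring. It is not taken from the course: it uses only Python's standard library rather than NLTK, and the sample HTML, the base URL, the tiny stopword list, and the names `LinkAndTextParser` and `top_sentences` are all assumptions made up for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical page and base URL, used only for this illustration.
BASE_URL = "https://example.com/"
SAMPLE_HTML = """
<html><body>
<p>Web crawlers follow links between pages. Crawlers index pages so search engines can find pages quickly.</p>
<p>Cats are nice.</p>
<a href="https://example.com/a">A</a>
<a href="/b">B</a>
</body></html>
"""

# Deliberately tiny stopword list (an assumption; NLTK ships a fuller one).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
             "that", "for", "on", "are", "as", "with"}


class LinkAndTextParser(HTMLParser):
    """Collects absolute link targets and visible text from an HTML string."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's base URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def content_words(text):
    """Lowercased words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_sentences(text, n=1):
    """Return the n highest-scoring sentences, kept in document order.

    A sentence's score is the sum of the document-wide frequencies of
    its content words.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(content_words(text))
    ranked = sorted(sentences,
                    key=lambda s: sum(freq[w] for w in content_words(s)),
                    reverse=True)[:n]
    return [s for s in sentences if s in ranked]


parser = LinkAndTextParser(BASE_URL)
parser.feed(SAMPLE_HTML)
links = parser.links
summary = top_sentences(" ".join(parser.text_parts), n=1)
print(links)
print(summary)
```

This simple scoring favors longer sentences because it sums frequencies without normalizing by length. A real crawler would also need to fetch pages over the network, respect robots.txt, and handle malformed markup, none of which is shown here.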
After completing his PhD in astrophysics, Mikhail Klassen transitioned to data science and refined his expertise in data mining, data analysis, and machine learning. He's now the Chief Data Scientist for Paladin: Paradigm Knowledge Solutions in Montreal, where he combines data mining and artificial intelligence to deliver personalized training for the aerospace industry.
Product information
- Title: Mining the Social Web - Web Pages
- Author(s):
- Release date: September 2017
- Publisher(s): Infinite Skills
- ISBN: 9781491989883