Introduction to Computing Using Python: An Application Development Focus

by Ljubomir Perkovic
December 2011
Beginner
508 pages
English
Wiley

CHAPTER 11

The Web and Search

11.1 The World Wide Web

11.2 Python WWW API

11.3 String Pattern Matching

11.4 Case Study: Web Crawler

Chapter Summary

Solutions to Practice Problems

Exercises

Problems

IN THIS CHAPTER, we introduce the World Wide Web (the WWW or simply the web). The web is one of the most important developments in computer science. It has become the platform of choice for sharing information and communicating. Consequently, the web is a rich source for cutting-edge application development.

We start this chapter by describing the three core WWW technologies: Uniform Resource Locators (URLs), the HyperText Transfer Protocol (HTTP), and the HyperText Markup Language (HTML). We focus especially on HTML, the language of web pages. We then go over the Standard Library modules that enable developers to write programs that access, download, and process documents on the web. We focus, in particular, on mastering tools such as HTML parsers and regular expressions that help us process web pages and analyze the content of text documents.
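As a rough preview of the kind of processing described above, the sketch below uses three Standard Library modules: urllib.parse to take a URL apart, html.parser to pull the hyperlinks out of an HTML document, and re to find a text pattern. The names LinkCollector, get_source, PAGE_URL, and PAGE_HTML, the sample page, and the e-mail pattern are all illustrative assumptions, not code from the chapter. The page source is a hard-coded string so that the example runs without a network connection.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


def get_source(url):
    """Download a web page and return its source as a string.

    Defined for illustration only; it is not called below, so the
    example does not need network access.
    """
    with urlopen(url) as response:
        return response.read().decode()


class LinkCollector(HTMLParser):
    """Hypothetical parser that records the absolute URL of every
    hyperlink (<a href="...">) found in the HTML fed to it."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


# Assumed sample data standing in for a downloaded page.
PAGE_URL = 'http://www.example.com/docs/index.html'
PAGE_HTML = '''<html><head><title>Sample</title></head>
<body>
<h1>Sample page</h1>
<p>Read <a href="chapter1.html">Chapter 1</a> or visit
<a href="http://www.example.org/">another site</a>.</p>
<p>Write to info@example.com for help</p>
</body></html>'''

# URL: split the address into its components.
parts = urlparse(PAGE_URL)
print(parts.scheme, parts.netloc, parts.path)

# HTML: collect the links that lead out of the page.
collector = LinkCollector(PAGE_URL)
collector.feed(PAGE_HTML)
print(collector.links)

# Regular expression: a deliberately simple e-mail address pattern.
emails = re.findall(r'\w+@\w+(?:\.\w+)+', PAGE_HTML)
print(emails)
```

Replacing PAGE_HTML with get_source(PAGE_URL) would run the same steps on a live page, assuming the page exists and is encoded as UTF-8.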

In this chapter's case study, we develop a web crawler, that is, a program that “crawls through the web.” Our crawler analyzes the content of each web page it visits and then calls itself recursively on every link leading out of that page. The crawler is the first step in the development of a search engine, which we develop in Chapter 12.
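The recursive idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the crawler developed in the case study: the three-page "web" in PAGES is made up and held in memory, the fetch parameter stands in for a real download function, and the visited list is an added safeguard so that pages linking to each other do not cause endless recursion. The content analysis step is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical three-page web, stored in memory instead of downloaded.
PAGES = {
    'http://example.com/one.html':
        '<a href="two.html">Two</a> <a href="three.html">Three</a>',
    'http://example.com/two.html':
        '<a href="one.html">One</a>',
    'http://example.com/three.html':
        '<p>No links here.</p>',
}


class LinkCollector(HTMLParser):
    """Hypothetical parser that records the absolute URL of every
    hyperlink in the HTML fed to it."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(urljoin(self.base_url, value))


def crawl(url, fetch, visited=None):
    """Visit url, then recursively visit every page it links to.

    fetch is any function that maps a URL to page source, or to None
    if the page is unavailable. Returns the URLs in visiting order.
    """
    if visited is None:
        visited = []
    if url in visited:          # already seen: stop to avoid a cycle
        return visited
    html = fetch(url)
    if html is None:            # page not available: skip it
        return visited
    visited.append(url)

    collector = LinkCollector(url)
    collector.feed(html)
    for link in collector.links:
        crawl(link, fetch, visited)   # the recursive step
    return visited


order = crawl('http://example.com/one.html', PAGES.get)
for url in order:
    print(url)
```

Passing PAGES.get as fetch keeps the example offline; a function that downloads the page could be passed instead without changing crawl itself.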

11.1 The World Wide Web

The World Wide Web (WWW or, simply, the web) is a distributed system of documents linked through ...

ISBN: 9781118213568