Parallel Programming with Python by Jan Palach

Using PP to make a distributed Web crawler

Now that we have run the code in parallel using PP to dispatch local processes, it is time to verify that the code also runs in a distributed way. For this, we will use the following three machines:

  • Iceman-Thinkad-X220: Ubuntu 13.10
  • Iceman-Q47OC-500P4C: Ubuntu 12.04 LTS
  • Asgard-desktop: Elementary OS

The idea is to use PP to dispatch the executions to the three machines listed above. As a case study, we will use the Web crawler. In web_crawler_pp_cluster.py, for each URL given in input_list, we dispatch a local or remote process for execution, and at the end of each execution a callback function groups the URL with the first three links found in it. ...
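The dispatch-and-callback pattern described above can be sketched with the standard library alone. This is not the code of web_crawler_pp_cluster.py and it does not use PP: a local thread pool stands in for PP's local and remote workers, and a hypothetical in-memory table of pages (FAKE_PAGES) replaces real HTTP requests so the example runs without a network. Apart from input_list, every name here (crawl_task, crawl_all, aggregate_results, LinkCollector) is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
import threading

# Hypothetical pages standing in for real HTTP responses (assumption).
FAKE_PAGES = {
    "http://example.com/a": '<a href="/1">1</a><a href="/2">2</a>'
                            '<a href="/3">3</a><a href="/4">4</a>',
    "http://example.com/b": '<a href="/x">x</a>',
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_task(url):
    """Unit of work for one URL: return the URL and its first three links."""
    parser = LinkCollector()
    parser.feed(FAKE_PAGES.get(url, ""))
    return url, parser.links[:3]


def crawl_all(input_list, max_workers=2):
    """Dispatch one task per URL and group the results in a callback."""
    result_dict = {}
    result_lock = threading.Lock()

    def aggregate_results(future):
        # Runs once per finished task, possibly on a worker thread,
        # so the shared dictionary is guarded by a lock.
        url, links = future.result()
        with result_lock:
            result_dict[url] = links

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for url in input_list:
            pool.submit(crawl_task, url).add_done_callback(aggregate_results)
    # Leaving the with block waits for every task and its callback.
    return result_dict
```

Calling crawl_all(list(FAKE_PAGES)) returns a dictionary that maps each URL to at most three links. In a PP version, the thread pool would presumably be replaced by a PP job server configured with the remote machines, with the grouping function passed as the callback when each job is submitted.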