Multiprocessing pools

In general, there is no benefit to running more processes than the computer has processors, for a few reasons:

  • At most cpu_count() processes can run simultaneously
  • Each process consumes resources with a full copy of the Python interpreter
  • Communication between processes is expensive
  • Creating processes takes a non-zero amount of time
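
As a rough illustration of the first point, the sketch below caps a worker count at the number of processors the machine reports. The helper name worker_count is hypothetical and exists only for this example; multiprocessing.cpu_count() is the standard library call.

```python
import multiprocessing


def worker_count(task_count):
    """Hypothetical helper: never ask for more workers than CPUs or tasks."""
    # Extra processes beyond cpu_count() cannot run simultaneously,
    # so they would only add memory and startup cost.
    return max(1, min(task_count, multiprocessing.cpu_count()))


if __name__ == "__main__":
    print("CPUs reported:", multiprocessing.cpu_count())
    print("workers for 3 tasks:", worker_count(3))
    print("workers for 1000 tasks:", worker_count(1000))
```

The result depends on the machine: with 1000 tasks, the helper simply returns whatever cpu_count() reports.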

Given these constraints, it makes sense to create at most cpu_count() processes when the program starts and then have them execute tasks as needed. This has much less overhead than starting a new process for each task.
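
One way to sketch this idea, assuming the standard library's multiprocessing.Pool is an acceptable stand-in for "a fixed set of processes that execute tasks as needed", is shown below. The function names square_with_pid and run_tasks are made up for the example. Each result carries the ID of the process that produced it, so the reuse of workers is visible.

```python
import multiprocessing
import os


def square_with_pid(n):
    """Return n squared together with the ID of the worker that computed it."""
    return n * n, os.getpid()


def run_tasks(numbers):
    """Run every task on a pool of cpu_count() long-lived worker processes."""
    # The workers are created once when the pool starts and are then
    # reused for each task, instead of one new process per task.
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        return pool.map(square_with_pid, numbers)


if __name__ == "__main__":
    results = run_tasks(range(100))
    squares = [square for square, _ in results]
    worker_pids = {pid for _, pid in results}
    print(squares[:5])
    print(f"{len(results)} tasks ran on {len(worker_pids)} worker process(es)")
```

However many tasks are submitted, the number of distinct process IDs in the results never exceeds cpu_count(), because the same workers handle task after task.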

It is not difficult to implement a basic series of communicating processes that does this, but it can be tricky to debug, test, and get right. Of course, ...
