Chapter 4. Other Implementations

Several implementations of the Python interpreter exist besides the CPython implementation written and maintained by the Python Software Foundation (that is, actually written by volunteer core contributors, but with the intellectual property held by the PSF). The version of Python you can download directly from the website, and the version included in most operating system distributions that ship any version at all, is CPython, so named because it is written in the C programming language (although a lot of bootstrapping is done in Python itself, even for CPython).

PyPy

A fascinating, arcane, and remarkable project is PyPy, an implementation of the Python interpreter written in … Python. Well, OK, technically it is written in a constrained subset of Python called RPython, but every RPython program is a Python program, in any case.

The PyPy website describes the project this way:

PyPy is a fast, compliant alternative implementation of the Python language (2.7.8 and 3.2.5). It has several advantages and distinct features:

  • Speed: thanks to its Just-in-Time compiler, Python programs often run faster on PyPy.

  • Memory usage: memory-hungry Python programs (several hundreds of MBs or more) might end up taking less space than they do in CPython.

  • Compatibility: PyPy is highly compatible with existing python code. It supports cffi and can run popular Python libraries like twisted and django.

  • Sandboxing: PyPy provides the ability to run untrusted code in a fully secure way.

  • Stackless: PyPy comes by default with support for stackless mode, providing micro-threads for massive concurrency.

PyPy is often vastly faster than CPython, especially on computational/numeric code (think 5 to 100 times faster, sometimes rivaling C or Fortran). It is stable and well researched. Some sticking points with library support remain, especially support for extension libraries written in C.
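For a feel for where the JIT helps, a tight numeric loop like the following toy example (not a rigorous benchmark) can be timed under both interpreters, e.g. with `time python script.py` versus `time pypy script.py`:

```python
def harmonic(n):
    # A naive numeric loop: CPython interprets every iteration, while
    # PyPy's JIT compiles the hot loop down to machine code.
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / k
    return total

print(round(harmonic(1_000_000), 4))
```

The arithmetic itself is trivial; the point is that the per-iteration interpreter overhead is exactly what PyPy's tracing JIT eliminates.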

There is active interaction between CPython core developers and PyPy developers, and often ideas from one source wind up influencing or being borrowed by the other. As well, the Python Software Foundation has, from time to time, funded specific efforts within PyPy development.

PyPy-STM

It is a quirky bit of the culture of Python to speak of complex programming concepts or libraries as “melting your brain,” and no project has this said of it quite as often as PyPy (at least among people who try to understand its internals; installing and running it is no harder than any other implementation). That is, no project did until its offshoot PyPy-STM came along.

PyPy-STM is a version of PyPy that implements Software Transactional Memory (STM). It is just past the experimental stage, and it promises huge additional speedups over regular PyPy for highly concurrent code on multi-core systems. According to Wikipedia:

[STM is a] concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. […]

STM is very optimistic: a thread completes modifications to shared memory without regard for what other threads might be doing, recording every read and write that it is performing in a log. […] the reader, […] after completing an entire transaction, verifies that other threads have not concurrently made changes to memory that it accessed in the past.

You can think of STM as speculative threading: threads execute simultaneously, potentially on multiple cores, without the serialization that the global interpreter lock (GIL) imposes. According to the PyPy-STM docs:

Threads execute their code speculatively, and at known points (e.g. between bytecodes) they coordinate with each other to agree on which order their respective actions should be “committed”, i.e. become globally visible. Each duration of time between two commit-points is called a transaction.
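The optimistic read-log-validate-commit cycle described above can be sketched in ordinary Python. This is a toy illustration only: the `STMVar` and `atomically` names are invented, real PyPy-STM works inside the interpreter rather than through any such API, and a real commit would need an atomic compare-and-swap rather than this unsynchronized check:

```python
class STMVar:
    """A shared cell with a version counter, so a transaction can detect
    whether another thread committed a change after it read the value."""
    def __init__(self, value):
        self.value = value
        self.version = 0

def atomically(func, var):
    """Run func(old) -> new speculatively, retrying on conflict."""
    while True:
        seen = var.version           # log what we read
        new_value = func(var.value)  # do the work without holding any lock
        if var.version == seen:      # commit point: did anyone interfere?
            var.value = new_value
            var.version += 1
            return new_value
        # Conflict detected: discard the speculative work and retry
        # the whole transaction.

counter = STMVar(0)
for _ in range(5):
    atomically(lambda v: v + 1, counter)
print(counter.value)  # 5
```

The essential point is the one the quote makes: no locks are held while the work is done, and consistency is verified only at the commit point.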

The bottom line is that PyPy can be a lot faster than CPython, but, like CPython, it cannot easily gain further speed from multiple cores on a CPU. After a roughly constant overhead hit, PyPy-STM performance scales roughly linearly with the number of cores:

# of threads    PyPy (head)    PyPy-STM (2.3r2)
N = 1           real 0.92s     real 1.34s
N = 2           real 1.77s     real 1.39s
N = 3           real 2.57s     real 1.58s
N = 4           real 3.38s     real 1.64s
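For a sense of what such a benchmark exercises, here is a hedged sketch of a CPU-bound threaded workload (the function names and work sizes are invented; this is not the script that produced the numbers above):

```python
import threading

def count_down(n, results):
    # Pure CPU-bound busy work, done entirely in Python bytecode.
    while n:
        n -= 1
    results.append(n)

def run(num_threads, work=100_000):
    results = []
    threads = [threading.Thread(target=count_down, args=(work, results))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Under CPython or plain PyPy the GIL serializes these threads, so wall-clock
# time grows with num_threads; under PyPy-STM they can commit in parallel.
print(len(run(4)))  # 4
```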

Jython

Jython is an implementation of the Python programming language that is designed to run on the Java™ platform. It contains a compiler that transforms Python source code to Java bytecodes, which can then run on a JVM. It also includes Python standard library modules which are used by the compiled Java bytecodes. But of greatest relevance for users, Jython lets users import and use Java packages as if they were Python modules.

Jython is highly compatible with CPython when both rely on pure Python modules or standard library modules. The difference arises in that Jython can access all of the packages developers have written in Java or other JVM-compatible languages, whereas CPython can access C extension modules written for use with CPython.

Although JVMs use just-in-time compilation and other optimization techniques, Jython winds up with performance similar to CPython overall, so unlike with PyPy there is no performance gain (nor any loss on average; obviously specific micro-benchmarks will vary, perhaps widely). Unfortunately, Jython so far supports only the Python 2.x series.

Here is a quick example of use, taken from the Jython documentation:

>>> from java.util import Random
>>> r = Random()
>>> r.nextInt()

Obviously, Python itself—including Jython—has its own random module with similar functionality. But the code sample illustrates the seamless use of Java packages in a simple case.

IronPython

IronPython is a lot like Jython overall, merely substituting the .NET Framework and the CLR in place of Java packages and JVMs. Like Jython, IronPython is highly compatible with CPython (assuming pure Python or standard library modules are used), has similar overall performance, and is incompatible with C extension modules written for use with CPython.

IronPython is useful for Python programmers who want to access the various libraries written for the .NET Framework, in whichever supported language they were written. Like Jython, however, it so far supports only the Python 2.x series.

Here is a similar quick example to the one given for Jython, taken from the IronPython documentation:

>>> from System.Collections.Generic import List, Dictionary
>>> int_list = List[int]()
>>> str_float_dict = Dictionary[str, float]()

In this case, Python has list and dict, but they are not generics: they cannot be parameterized by element type in the manner shown in the example. Third parties may well have written analogous collections in Python, or as C extension modules, but they are not part of the standard library of CPython.
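A quick illustration of the difference: a plain Python list places no constraint on element types, whereas the IronPython List[int] above admits only integers:

```python
# Plain Python collections are heterogeneous; nothing enforces a single
# element type the way the CLR's generic List[int] does.
int_list = [1, 2, 3]
int_list.append("not an int")   # perfectly legal in CPython
print(int_list)                  # [1, 2, 3, 'not an int']
```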

Cython

Cython is an enhanced version of Python chiefly used for writing extension modules. It allows for optional type annotations to make large speed gains in numeric code (and in some other cases). The Cython website describes it like so:

Cython is an optimising static compiler for both the Python programming language and the extended Cython programming language. […]

The Cython language is a superset of the Python language that additionally supports calling C functions and declaring C types on variables and class attributes. This allows the compiler to generate very efficient C code from Cython code. The C code is generated once and then compiles with all major C/C++ compilers in CPython 2.6, 2.7 (2.4+ with Cython 0.20.x) as well as 3.2 and all later versions.

Let us borrow a simple example from the Cython documentation:

Consider the following pure Python code:

def f(x):
    return x**2 - x

def integrate_f(a, b, N):
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

Simply compiling this in Cython merely gives a 35% speedup. This is better than nothing, but adding some static types can make a much larger difference.

With additional type declarations, this might look like:

def f(double x):
    return x**2 - x

def integrate_f(double a, double b, int N):
    cdef int i
    cdef double s, dx
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

Since the iterator variable i is typed with C semantics, the for-loop will be compiled to pure C code. Typing a, s and dx is important as they are involved in arithmetic within the for-loop; typing b and N makes less of a difference, but in this case it is not much extra work to be consistent and type the entire function.

This results in a 4 times speedup over the pure Python version.
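To check such claims yourself, the pure-Python version can be timed with the standard timeit module before compiling anything (a sketch; the iteration counts are arbitrary):

```python
import timeit

def f(x):
    return x**2 - x

def integrate_f(a, b, N):
    # Left Riemann sum of f over [a, b] with N steps.
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

# The exact integral of x**2 - x over [0, 2] is 2/3, a handy sanity check.
approx = integrate_f(0.0, 2.0, 10_000)

# Time many repetitions of a smaller run; compare against the Cython build.
elapsed = timeit.timeit(lambda: integrate_f(0.0, 2.0, 1_000), number=100)
print(approx, elapsed)
```

Running the same timing against the compiled module gives the before/after numbers the speedup figures refer to.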

While you could make Cython your general Python interpreter, in practice it is used as an adjunct to CPython, with particular modules that are performance critical being compiled as C extension modules. Cython modules use the extension .pyx, but produce an intermediate .c file that is compiled to an actual .so or .pyd module. Hence the result genuinely is a C extension module, albeit one whose code was auto-generated from a Python superset that is likely to be far more readable than C written from scratch.
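Building such a module is typically automated with a small build script using Cython's cythonize helper. A minimal sketch (the module name integrate.pyx is illustrative), built in place with `python setup.py build_ext --inplace`:

```python
# setup.py -- minimal Cython build configuration (sketch).
# cythonize() turns each .pyx into a .c file and arranges for the
# C compiler to produce the final .so/.pyd extension module.
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("integrate.pyx"))
```

After building, `import integrate` in CPython loads the compiled extension exactly as it would any hand-written C module.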

Numba

Numba is not technically a Python implementation, but rather “merely” a C extension module for CPython. However, what it does is akin to a mixture of PyPy and Cython. Without any semantic changes, importing and using the decorators supplied by numba causes just-in-time compilation and optimization of code on a per-function basis, substituting a fast machine code path for CPython interpretation of the decorated function. Like PyPy, this gives you dynamic compilation, type inference, and just-in-time optimization; like Cython, you can also annotate types explicitly where they are known in advance. Moreover, Numba “plays well” with NumPy, so using both together can produce extremely fast execution.

Quoting from the Numba documentation:

Numba gives you the power to speed up your applications with high performance functions written directly in Python.

Numba generates optimized machine code from pure Python code using the LLVM compiler infrastructure. With a few simple annotations, array-oriented and math-heavy Python code can be just-in-time optimized to performance similar as C, C++ and Fortran, without having to switch languages or Python interpreters.

Numba’s main features are:

  • on-the-fly code generation (at import time or runtime, at the user’s preference)

  • native code generation for the CPU (default) and GPU hardware

  • integration with the Python scientific software stack (thanks to Numpy)

Here is how a Numba-optimized function, taking a NumPy array as argument, might look:

from numba import jit

@jit
def sum2d(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i, j]
    return result

[…] You can also tell Numba the function signature you are expecting. The function f() would now look like:

from numba import jit, int32
@jit(int32(int32, int32))
def f(x, y):
    # A somewhat trivial example
    return x + y
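Calling the decorated function is no different from calling ordinary Python. A small sketch of such a call, with a pure-Python fallback added here so the example runs even where Numba is not installed:

```python
try:
    from numba import jit, int32

    # Eagerly compile for 32-bit integer arguments, as in the docs example.
    @jit(int32(int32, int32))
    def f(x, y):
        return x + y
except ImportError:
    # Fallback for environments without Numba: same semantics, no JIT.
    def f(x, y):
        return x + y

print(f(2, 3))  # 5
```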
