Debugging the Garbage-Collection Process

Credit: Dirk Holtwick

Problem

You know that memory is leaking from your program, but you have no indication of what exactly is being leaked. You need more information to help you figure out where the leaks are coming from, so you can remove them and lighten the garbage-collection work periodically performed by the standard gc module.

Solution

The gc module lets you dig into garbage-collection issues:

import gc

def dump_garbage():
    """
    Show us what the garbage is about.
    """
    # Force collection
    print("\nGARBAGE:")
    gc.collect()

    # With gc.DEBUG_LEAK set, everything the collector found is kept here
    print("\nGARBAGE OBJECTS:")
    for x in gc.garbage:
        s = str(x)
        if len(s) > 80:
            s = s[:77] + '...'
        print("%s\n   %s" % (type(x), s))

if __name__ == "__main__":
    gc.enable()
    gc.set_debug(gc.DEBUG_LEAK)

    # Make a leak
    l = []
    l.append(l)
    del l

    # show the dirt ;-)
    dump_garbage()

Discussion

In addition to the normal debugging output of gc, this recipe shows the garbage objects themselves to help you get an idea of where the leak may be. Situations that leave cyclic garbage for the collector to clean up should be avoided where possible. Most of the time, they're caused by objects that refer to themselves, or by similar reference loops (also known as cycles).

Once you’ve found where the reference loops are coming from, Python offers all the needed tools to remove them, particularly weak references (in the weakref standard library module). But especially in big programs, you first have to get an idea of where to find the leak before you can remove it and enhance your program’s performance. For this, it’s good to know what the objects being leaked contain, so the dump_garbage function in this recipe can come in quite handy on such occasions.
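As a minimal sketch of the weak-reference approach mentioned above, consider a child object that needs to reach its parent. The Node class and its attribute names are hypothetical, invented only for this illustration. Holding the parent through weakref.ref means the child-to-parent link does not close a reference loop, so the parent can go away as soon as the last ordinary reference to it is dropped:

```python
import gc
import weakref


class Node(object):
    """Hypothetical tree node, used only for illustration."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        # Hold the parent weakly so that child -> parent does not close a cycle.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Calling a weak reference returns the target, or None once it is gone.
        return self._parent() if self._parent is not None else None


root = Node("root")
leaf = Node("leaf", root)
print(leaf.parent.name)   # the parent is reachable while root is alive

del root                  # no cycle exists, so nothing is left for gc to untangle
gc.collect()              # only needed on implementations without reference counting
print(leaf.parent)        # the weak reference no longer resolves
```

The trade-off is that code using leaf.parent must be prepared to get None back once the parent has been released.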

This recipe works by first calling gc.set_debug to tell the gc module to keep the leaked objects in its gc.garbage list rather than recycling them. Then, this recipe’s dump_garbage function calls gc.collect to force a garbage-collection process to run, even if there is still ample free memory, so it can examine each item in gc.garbage and print out its type and contents (limiting the printout to no more than 80 characters to avoid flooding the screen with huge chunks of information).
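The same mechanism can be wrapped in a helper that returns the saved objects instead of printing them, which is convenient in a test. This is only a sketch under some assumptions: the name garbage_from and its action parameter are made up for this example, and it sets gc.DEBUG_SAVEALL alone (the flag within gc.DEBUG_LEAK that keeps objects in gc.garbage) to avoid the collector's diagnostic output.

```python
import gc


def garbage_from(action):
    """Run action() and return the objects gc found unreachable afterward.

    Hypothetical helper: 'action' is any zero-argument callable suspected
    of leaving reference cycles behind.
    """
    old_flags = gc.get_debug()
    gc.collect()                      # flush garbage that predates the action
    gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage
    try:
        del gc.garbage[:]             # start from an empty list
        action()
        gc.collect()                  # force a full collection right now
        return list(gc.garbage)       # copy, because the original is cleared below
    finally:
        gc.set_debug(old_flags)       # restore the caller's debugging flags
        del gc.garbage[:]             # let the saved objects be released


def make_leak():
    # The same self-referencing list that the recipe uses
    l = []
    l.append(l)


for obj in garbage_from(make_leak):
    print("%s\n   %s" % (type(obj), str(obj)[:80]))
```

Restoring the previous flags and emptying gc.garbage in the finally clause matters: while gc.DEBUG_SAVEALL is set, every collected object stays alive in that list, so leaving it on would itself hold on to memory.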

See Also

Documentation for the gc module in the Library Reference.
