Python in a Nutshell, 3rd Edition by Steve Holden, Anna Ravenscroft, Alex Martelli

Chapter 4. Object-Oriented Python

Python is an object-oriented (OO) programming language. Unlike some other object-oriented languages, Python doesn’t force you to use the object-oriented paradigm exclusively: it also supports procedural programming with modules and functions, so you can select the best paradigm for each part of your program. The object-oriented paradigm helps you group state (data) and behavior (code) together in handy packets of functionality. It’s also useful when you want to use some of Python’s object-oriented mechanisms covered in this chapter, such as inheritance or special methods. The procedural paradigm, based on modules and functions, may be simpler and more suitable when you don’t need the pluses of object-oriented programming. With Python, you can mix and match paradigms.

Python today has an object model different from that of many years ago. This chapter exclusively describes the so-called new-style, or new object model, which is simpler, more regular, more powerful, and the one we recommend you always use; whenever we speak of classes or instances, we mean new-style classes or instances. However, for backward compatibility, the default object model in v2 is the legacy object model, also known as the classic or old-style object model; the new-style object model is the only one in v3. For legacy information about the old style, see some ancient online docs from 2002.

This chapter also covers special methods, in “Special Methods”, and advanced concepts known as abstract base classes, in “Abstract Base Classes”; decorators, in “Decorators”; and metaclasses, in “Metaclasses”.

Classes and Instances

If you’re familiar with object-oriented programming in other OO languages such as C++ or Java, you probably have a good intuitive grasp of classes and instances: a class is a user-defined type, which you instantiate to build instances, meaning objects of that type. Python supports these concepts through its class and instance objects.

Python Classes

A class is a Python object with several characteristics:

  • You can call a class object as if it were a function. The call, often known as instantiation, returns an object known as an instance of the class; the class is known as the type of the instance.

  • A class has arbitrarily named attributes that you can bind and reference.

  • The values of class attributes can be descriptors (including functions), covered in “Descriptors”, or normal data objects.

  • Class attributes bound to functions are also known as methods of the class.

  • A method can have a special Python-defined name with two leading and two trailing underscores (commonly known as dunder names, short for “double-underscore names”—the name __init__, for example, is often pronounced as “dunder init”). Python implicitly calls such special methods, if a class supplies them, when various kinds of operations take place on instances of that class.

  • A class can inherit from other classes, meaning it delegates to other class objects the lookup of attributes that are not found in the class itself.

An instance of a class is a Python object with arbitrarily named attributes that you can bind and reference. An instance object implicitly delegates to its class the lookup of attributes not found in the instance itself. The class, in turn, may delegate the lookup to classes from which it inherits, if any.

In Python, classes are objects (values), handled like other objects. Thus, you can pass a class as an argument in a call to a function. Similarly, a function can return a class as the result of a call. A class, just like any other object, can be bound to a variable (local or global), an item in a container, or an attribute of an object. Classes can also be keys into a dictionary. The fact that classes are ordinary objects in Python is often expressed by saying that classes are first-class objects.
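As a small illustration of classes being first-class objects, the sketch below passes a class to a function and uses classes as dictionary keys. The names Dog, Cat, make, and registry are hypothetical, chosen only for this example.

```python
# Hypothetical names: Dog, Cat, make, and registry are illustrative only.
class Dog(object):
    pass

class Cat(object):
    pass

def make(cls):
    # A class received as an argument is called like any other callable.
    return cls()

# Classes are hashable, so they can serve as dictionary keys.
registry = {Dog: 'woof', Cat: 'meow'}

pet = make(Dog)
print(registry[type(pet)])                # prints: woof
```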

The class Statement

The class statement is the most common way to create a class object. class is a single-clause compound statement with the following syntax:

class classname(base-classes):
    statement(s)

classname is an identifier. It is a variable that gets bound (or rebound) to the class object after the class statement finishes executing.

base-classes is a comma-delimited series of expressions whose values must be class objects. These classes are known by different names in different programming languages; you can, at your choice, call them the bases, superclasses, or parents of the class being created. The class being created can be said to inherit from, derive from, extend, or subclass its base classes, depending on what programming language you are familiar with; in this book, we generally use extend. This class is also known as a direct subclass or descendant of its base classes. In v3 only, base-classes can include a named argument metaclass=... to establish the class’s metaclass, as covered in “How Python v3 Determines a Class’s Metaclass”.

Syntactically, base-classes is optional: to indicate that you’re creating a class without bases, you can omit base-classes (and, optionally, the parentheses around it), placing the colon right after the classname. However, in v2, a class without bases, for backward compatibility, is an old-style one (unless you define the __metaclass__ attribute, covered in “How Python v2 Determines a Class’s Metaclass”). To create a new-style class C without any “true” bases, in v2, code class C(object):; since every type extends the built-in object, specifying object as the value of base-classes just means that class C is new-style rather than old-style. If all of your class’s ancestors are old-style (or it has none), and the class does not define the __metaclass__ attribute, then your class is old-style; otherwise, a class with bases is always new-style (even if some of the bases are new-style and some are old-style). In v3, all classes are new-style (you can still “inherit from object” quite innocuously, if you wish, to make your code backward compatible with v2). In our examples, for uniformity, we always specify (object) as a base rather than leaving a class “base-less”; however, if you’re programming in and for v3 only, it’s more elegant, and recommended, to leave such “fake subclassing” out.

The subclass relationship between classes is transitive: if C1 extends C2, and C2 extends C3, then C1 extends C3. Built-in function issubclass(C1, C2) accepts two arguments that are class objects: it returns True if C1 extends C2; otherwise, it returns False. Any class is a subclass of itself; therefore, issubclass(C, C) returns True for any class C. We cover the way in which base classes affect a class’s functionality in “Inheritance”.
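A minimal sketch of these issubclass rules, using three otherwise empty classes named as in the paragraph above:

```python
class C3(object): pass
class C2(C3): pass
class C1(C2): pass

print(issubclass(C1, C3))                 # prints: True (transitive)
print(issubclass(C3, C1))                 # prints: False
print(issubclass(C1, C1))                 # prints: True (a class is its own subclass)
```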

The nonempty sequence of indented statements that follows the class statement is known as the class body. A class body executes immediately as part of the class statement’s execution. Until the body finishes executing, the new class object does not yet exist, and the classname identifier is not yet bound (or rebound). “How a Metaclass Creates a Class” provides more details about what happens when a class statement executes.
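The following sketch shows both points: the body runs as part of the class statement, and the class name is not bound while the body runs. It assumes the hypothetical name Demo is not already bound in the enclosing scope.

```python
executed = []

class Demo(object):
    # This statement runs while the class statement itself executes.
    executed.append('body ran')
    # The name Demo is not yet bound at this point (assuming no earlier binding).
    bound_during_body = 'Demo' in globals()

print(executed)                           # prints: ['body ran']
print(Demo.bound_during_body)             # prints: False
```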

Finally, note that the class statement does not immediately create any instance of the new class, but rather defines the set of attributes shared by all instances when you later create instances by calling the class.

The Class Body

The body of a class is where you normally specify the attributes of the class; these attributes can be descriptor objects (including functions) or normal data objects of any type (an attribute of a class can also be another class—so, for example, you can have a class statement “nested” inside another class statement).

Attributes of class objects

You normally specify an attribute of a class object by binding a value to an identifier within the class body. For example:

class C1(object):
    x = 23
print(C1.x)                               # prints: 23

The class object C1 has an attribute named x, bound to the value 23, and C1.x refers to that attribute.

You can also bind or unbind class attributes outside the class body. For example:

class C2(object): pass
C2.x = 23
print(C2.x)                               # prints: 23

Your program is usually more readable if you bind, and thus create, class attributes only with statements inside the class body. However, rebinding them elsewhere may be necessary if you want to carry state information at a class, rather than instance, level; Python lets you do that, if you wish. There is no difference between a class attribute created in the class body, and one created or rebound outside the body by assigning to an attribute.

As we’ll discuss shortly, all instances of the class share all of the class’s attributes.

The class statement implicitly sets some class attributes. Attribute __name__ is the classname identifier string used in the class statement. Attribute __bases__ is the tuple of class objects given as the base classes in the class statement. For example, using the class C1 we just created:

print(C1.__name__, C1.__bases__)
# prints in v2: C1 (<type 'object'>,)
# prints in v3: C1 (<class 'object'>,)

A class also has an attribute __dict__, the mapping object that the class uses to hold other attributes (AKA its namespace); in classes, this mapping is read-only.

In statements that are directly in a class’s body, references to attributes of the class must use a simple name, not a fully qualified name. For example:

class C3(object):
    x = 23
    y = x + 22                         # must use just x, not C3.x

However, in statements in methods defined in a class body, references to attributes of the class must use a fully qualified name, not a simple name. For example:

class C4(object):
    x = 23
    def amethod(self):
        print(C4.x)  # must use C4.x or self.x, not just x!

Note that attribute references (i.e., an expression like C.s) have semantics richer than those of attribute bindings. We cover these references in detail in “Attribute Reference Basics”.

Function definitions in a class body

Most class bodies include def statements, since functions (known as methods in this context) are important attributes for most class objects. A def statement in a class body obeys the rules presented in “Functions”. In addition, a method defined in a class body has a mandatory first parameter, conventionally named self, that refers to the instance on which you call the method. The self parameter plays a special role in method calls, as covered in “Bound and Unbound Methods”.

Here’s an example of a class that includes a method definition:

class C5(object):
    def hello(self):
        print('Hello')

A class can define a variety of special methods (methods with names that have two leading and two trailing underscores—these are also occasionally called “magic methods,” and frequently referenced verbally as “dunder” methods, e.g., “dunder init” for __init__) relating to specific operations on its instances. We discuss special methods in detail in “Special Methods”.

Class-private variables

When a statement in a class body (or in a method in the body) uses an identifier starting with two underscores (but not ending with underscores), such as __ident, the Python compiler implicitly changes the identifier into _classname__ident, where classname is the name of the class. This lets a class use “private” names for attributes, methods, global variables, and other purposes, reducing the risk of accidentally duplicating names used elsewhere, particularly in subclasses.
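A short sketch of this name mangling, using a hypothetical Account class. The mangled name remains reachable from outside, so this is a way to avoid accidental clashes, not an access-control mechanism.

```python
class Account(object):                    # hypothetical example class
    def __init__(self):
        self.__balance = 100              # stored as _Account__balance

    def balance(self):
        return self.__balance             # compiled as self._Account__balance

acct = Account()
print(acct.balance())                     # prints: 100
print(acct._Account__balance)             # prints: 100
print(hasattr(acct, '__balance'))         # prints: False
```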

By convention, identifiers starting with a single underscore are meant to be private to the scope that binds them, whether that scope is or isn’t a class. The Python compiler does not enforce this privacy convention: it’s up to programmers to respect it.

Class documentation strings

If the first statement in the class body is a string literal, the compiler binds that string as the documentation string for the class. This attribute, named __doc__, is the docstring of the class. See “Docstrings” for more information on docstrings.

Descriptors

A descriptor is any object whose class supplies a special method named __get__. Descriptors that are class attributes control the semantics of accessing and setting attributes on instances of that class. Roughly speaking, when you access an instance attribute, Python gets the attribute’s value by calling __get__ on the corresponding descriptor, if any. For example:

class Const(object):        # an overriding descriptor, see later
    def __init__(self, value):
        self.value = value
    def __set__(self, *_):  # ignore any attempt at setting
        pass
    def __get__(self, *_):  # always return the constant value
        return self.value

class X(object):
    c = Const(23)

x = X()
print(x.c)  # prints: 23
x.c = 42
print(x.c)  # prints: 23

For more details, see “Attribute Reference Basics”.

Overriding and nonoverriding descriptors

If a descriptor’s class also supplies a special method named __set__, then the descriptor is known as an overriding descriptor (or, by an older, more widespread, and slightly confusing terminology, a data descriptor); if the descriptor’s class supplies only __get__ and not __set__, then the descriptor is known as a nonoverriding (or nondata) descriptor.

For example, the class of function objects supplies __get__, but not __set__; therefore, function objects are nonoverriding descriptors. Roughly speaking, when you assign a value to an instance attribute with a corresponding descriptor that is overriding, Python sets the attribute value by calling __set__ on the descriptor. For more details, see “Attributes of instance objects”.
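The practical difference shows up when an instance’s own __dict__ holds an entry with the same name as the descriptor. The sketch below uses hypothetical classes NonOverriding, Overriding, and Holder, and writes to __dict__ directly so that __set__ is bypassed.

```python
class NonOverriding(object):
    def __get__(self, obj, cls=None):
        return 'from descriptor'

class Overriding(NonOverriding):
    def __set__(self, obj, value):
        raise AttributeError('read-only')

class Holder(object):
    weak = NonOverriding()
    strong = Overriding()

h = Holder()
# Write straight into the instance namespace, bypassing any __set__.
h.__dict__['weak'] = 'from instance'
h.__dict__['strong'] = 'from instance'

print(h.weak)                             # prints: from instance
print(h.strong)                           # prints: from descriptor
```

The instance entry shadows the nonoverriding descriptor, while the overriding descriptor takes precedence over the instance entry.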

Instances

To create an instance of a class, call the class object as if it were a function. Each call returns a new instance whose type is that class:

an_instance = C5()

You can call built-in function isinstance(i, C) with a class object as argument C. isinstance returns True when object i is an instance of class C or of any subclass of C. Otherwise, isinstance returns False.
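For instance, with two hypothetical classes Base and Derived:

```python
class Base(object): pass
class Derived(Base): pass

d = Derived()
print(isinstance(d, Derived))             # prints: True
print(isinstance(d, Base))                # prints: True (Derived is a subclass of Base)
print(isinstance(Base(), Derived))        # prints: False
```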

__init__

When a class defines or inherits a method named __init__, calling the class object implicitly executes __init__ on the new instance to perform any needed per-instance initialization. Arguments passed in the call must correspond to the parameters of __init__, except for parameter self. For example, consider:

class C6(object):
    def __init__(self, n):
        self.x = n

Here’s how you can create an instance of the C6 class:

another_instance = C6(42)

As shown in the C6 class, the __init__ method typically contains statements that bind instance attributes. An __init__ method must not return a value other than None; if it does, Python raises a TypeError exception.
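A minimal sketch of that TypeError, using a hypothetical class Bad whose __init__ wrongly returns a value:

```python
class Bad(object):
    def __init__(self):
        return 42                         # wrong: __init__ must return None

raised = False
try:
    Bad()
except TypeError:
    raised = True

print(raised)                             # prints: True
```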

The main purpose of __init__ is to bind, and thus create, the attributes of a newly created instance. You may also bind, rebind, or unbind instance attributes outside __init__, as you’ll see shortly. However, your code is more readable when you initially bind all attributes of a class instance in the __init__ method.

When __init__ is absent (and is not inherited from any base), you must call the class without arguments, and the new instance has no instance-specific attributes.

Attributes of instance objects

Once you have created an instance, you can access its attributes (data and methods) using the dot (.) operator. For example:

an_instance.hello()                       # prints: Hello
print(another_instance.x)                 # prints: 42

Attribute references such as these have fairly rich semantics in Python; we cover them in detail in “Attribute Reference Basics”.

You can give an instance object an arbitrary attribute by binding a value to an attribute reference. For example:

class C7(object): pass
z = C7()
z.x = 23
print(z.x)                               # prints: 23

Instance object z now has an attribute named x, bound to the value 23, and z.x refers to that attribute. Note that the __setattr__ special method, if present, intercepts every attempt to bind an attribute. (We cover __setattr__ in Table 4-1.) When you attempt to bind on an instance an attribute whose name corresponds to an overriding descriptor in the class, the descriptor’s __set__ method intercepts the attempt: should C7.x be an overriding descriptor, the assignment z.x=23 would execute type(z).x.__set__(z, 23).

Creating an instance implicitly sets two instance attributes. For any instance z, z.__class__ is the class object to which z belongs, and z.__dict__ is the mapping that z uses to hold its other attributes. For example, for the instance z we just created:

print(z.__class__.__name__, z.__dict__)    # prints: C7 {'x': 23}

You may rebind (but not unbind) either or both of these attributes, but this is rarely necessary.

For any instance z, any object x, and any identifier S (except __class__ and __dict__), z.S=x is equivalent to z.__dict__['S']=x (unless a __setattr__ special method, or an overriding descriptor’s __set__ special method, intercepts the binding attempt). For example, again referring to the z we just created:

z.y = 45
z.__dict__['z'] = 67
print(z.x, z.y, z.z)                        # prints: 23 45 67

There is no difference between instance attributes created by assigning to attributes and those created by explicitly binding an entry in z.__dict__.

The factory-function idiom

It’s often necessary to create instances of different classes depending on some condition, or to avoid creating a new instance if an existing one is available for reuse. A common misconception is that such needs might be met by having __init__ return a particular object, but such an approach is unfeasible: Python raises an exception if __init__ returns any value other than None. The best way to implement flexible object creation is by using a function, rather than calling the class object directly. A function used this way is known as a factory function.

Calling a factory function is a flexible approach: a function may return an existing reusable instance, or create a new instance by calling whatever class is appropriate. Say you have two almost interchangeable classes (SpecialCase and NormalCase) and want to flexibly generate instances of either one of them, depending on an argument. The following appropriate_case factory function, as a “toy” example, allows you to do just that (we cover the role of the self parameter in “Bound and Unbound Methods”):

class SpecialCase(object):
    def amethod(self): print('special')
class NormalCase(object):
    def amethod(self): print('normal')
def appropriate_case(isnormal=True):
    if isnormal: return NormalCase()
    else: return SpecialCase()
aninstance = appropriate_case(isnormal=False)
aninstance.amethod()                  # prints: special

__new__

Each class has (or inherits) a class method named __new__ (we cover class methods in “Class methods”). When you call C(*args,**kwds) to create a new instance of class C, Python first calls C.__new__(C,*args,**kwds). Python uses __new__’s return value x as the newly created instance. Then, Python calls C.__init__(x,*args,**kwds), but only when x is indeed an instance of C or any of its subclasses (otherwise, x’s state remains as __new__ had left it). Thus, for example, the statement x=C(23) is equivalent to:

x = C.__new__(C, 23)
if isinstance(x, C): type(x).__init__(x, 23)

object.__new__ creates a new, uninitialized instance of the class it receives as its first argument. It ignores other arguments when that class has an __init__ method, but it raises an exception when it receives other arguments beyond the first, and the class that’s the first argument does not have an __init__ method. When you override __new__ within a class body, you do not need to add __new__=classmethod(__new__), nor use an @classmethod decorator, as you normally would: Python recognizes the name __new__ and treats it specially in this context. In those extremely rare cases in which you rebind C.__new__ later, outside the body of class C, you do need to use C.__new__=classmethod(whatever).
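The argument-handling rule can be seen in a small sketch, assuming v3 and two hypothetical classes, NoInit and WithInit:

```python
class NoInit(object):
    pass

class WithInit(object):
    def __init__(self, n):
        self.n = n

obj = WithInit(23)      # fine: object.__new__ ignores 23, __init__ consumes it

rejected = False
try:
    NoInit(23)          # no __init__ to consume 23, so object.__new__ complains
except TypeError:
    rejected = True

print(obj.n, rejected)                    # prints: 23 True
```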

__new__ has most of the flexibility of a factory function, as covered in “The factory-function idiom”. __new__ may choose to return an existing instance or make a new one, as appropriate. When __new__ does need to create a new instance, it most often delegates creation by calling object.__new__ or the __new__ method of another superclass of C. The following example shows how to override class method __new__ in order to implement a version of the Singleton design pattern:

class Singleton(object):
    _singletons = {}
    def __new__(cls, *args, **kwds):
        if cls not in cls._singletons:
            cls._singletons[cls] = super(Singleton, cls).__new__(cls)
        return cls._singletons[cls]

(We cover built-in super in “Cooperative superclass method calling”.)

Any subclass of Singleton (that does not further override __new__) has exactly one instance. If the subclass defines __init__, it must ensure that __init__ is safe to call repeatedly: each time you instantiate the subclass, Python runs __init__ again on that subclass’s one and only instance.
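To make the repeated __init__ calls concrete, here is a sketch that repeats the Singleton class so that it is self-contained, and adds a hypothetical subclass Config that counts its initializations:

```python
class Singleton(object):
    _singletons = {}
    def __new__(cls, *args, **kwds):
        if cls not in cls._singletons:
            cls._singletons[cls] = super(Singleton, cls).__new__(cls)
        return cls._singletons[cls]

class Config(Singleton):                  # hypothetical subclass
    init_calls = 0
    def __init__(self):
        Config.init_calls += 1            # runs at every instantiation request

a = Config()
b = Config()
print(a is b)                             # prints: True
print(Config.init_calls)                  # prints: 2
```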

v3 allows simpler, cleaner coding for this example

We coded this example to work equally well in v2 and v3. In v3, you could code it more simply and cleanly by giving Singleton no superclasses, and calling super without arguments.
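One possible v3-only rendering of that simplification, offered as a sketch:

```python
class Singleton:
    _singletons = {}
    def __new__(cls, *args, **kwds):
        if cls not in cls._singletons:
            # v3 only: super() with no arguments
            cls._singletons[cls] = super().__new__(cls)
        return cls._singletons[cls]

print(Singleton() is Singleton())         # prints: True
```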

Attribute Reference Basics

An attribute reference is an expression of the form x.name, where x is any expression and name is an identifier called the attribute name. Many kinds of Python objects have attributes, but an attribute reference has special rich semantics when x refers to a class or instance. Remember that methods are attributes too, so everything we say about attributes in general also applies to attributes that are callable (i.e., methods).

Say that x is an instance of class C, which inherits from base class B. Both classes and the instance have several attributes (data and methods), as follows:

class B(object):
    a = 23
    b = 45
    def f(self): print('method f in class B')
    def g(self): print('method g in class B')
class C(B):
    b = 67
    c = 89
    d = 123
    def g(self): print('method g in class C')
    def h(self): print('method h in class C')
x = C()
x.d = 77
x.e = 88

A few attribute dunder-names are special. C.__name__ is the string 'C', the class’s name. C.__bases__ is the tuple (B,), the tuple of C’s base classes. x.__class__ is the class C, the class to which x belongs. When you refer to an attribute with one of these special names, the attribute reference looks directly into a dedicated slot in the class or instance object and fetches the value it finds there. You cannot unbind these attributes. You may rebind them on the fly, to change the name or base classes of a class, or to change the class of an instance, but this advanced technique is rarely necessary.

Both class C and instance x each have one other special attribute: a mapping named __dict__. All other attributes of a class or instance, except for the few special ones, are held as items in the __dict__ attribute of the class or instance.

Getting an attribute from a class

When you use the syntax C.name to refer to an attribute on a class object C, the lookup proceeds in two steps:

  1. When 'name' is a key in C.__dict__, C.name fetches the value v from C.__dict__['name']. Then, when v is a descriptor (i.e., type(v) supplies a method named __get__), the value of C.name is the result of calling type(v).__get__(v, None, C). When v is not a descriptor, the value of C.name is v.

  2. When 'name' is not a key in C.__dict__, C.name delegates the lookup to C’s base classes, meaning it loops on C’s ancestor classes and tries the name lookup on each (in method resolution order, as covered in “Method resolution order”).
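The two steps can be observed with a descriptor that simply reports the arguments its __get__ receives. Reporter, Owner, and Child are hypothetical names used only for this sketch.

```python
class Reporter(object):
    def __get__(self, obj, cls=None):
        return (obj, cls)                 # report what __get__ was called with

class Owner(object):
    r = Reporter()
    plain = 23

class Child(Owner):
    pass

print(Owner.r == (None, Owner))           # prints: True (step 1, descriptor)
print(Owner.plain)                        # prints: 23 (step 1, nondescriptor)
print(Child.r == (None, Child))           # prints: True (step 2, found in a base)
```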

Getting an attribute from an instance

When you use the syntax x.name to refer to an attribute of instance x of class C, the lookup proceeds in three steps:

  1. When 'name' is found in C (or in one of C’s ancestor classes) as the name of an overriding descriptor v (i.e., type(v) supplies methods __get__ and __set__)

    • The value of x.name is the result of type(v).__get__(v, x, C)

  2. Otherwise, when 'name' is a key in x.__dict__

    • x.name fetches and returns the value at x.__dict__['name']

  3. Otherwise, x.name delegates the lookup to x’s class (according to the same two-step lookup used for C.name, as just detailed)

    • When a descriptor v is found, the overall result of the attribute lookup is, again, type(v).__get__(v, x, C)

    • When a nondescriptor value v is found, the overall result of the attribute lookup is just v
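The three steps above can be approximated in plain Python. This is a simplified sketch, not the interpreter’s actual implementation: the helpers find_in_classes and instance_lookup are hypothetical, the sketch assumes the instance has a __dict__, and it ignores __getattr__ and __slots__.

```python
_MISSING = object()

def find_in_classes(cls, name):
    # Search the class and its ancestors in method resolution order.
    for klass in cls.__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    return _MISSING

def instance_lookup(x, name):
    cls = type(x)
    v = find_in_classes(cls, name)
    tv = type(v)
    # Step 1: an overriding descriptor in the class wins.
    if v is not _MISSING and hasattr(tv, '__get__') and hasattr(tv, '__set__'):
        return tv.__get__(v, x, cls)
    # Step 2: otherwise, an entry in the instance's own __dict__.
    if name in x.__dict__:
        return x.__dict__[name]
    # Step 3: otherwise, whatever the class lookup found.
    if v is not _MISSING:
        if hasattr(tv, '__get__'):
            return tv.__get__(v, x, cls)
        return v
    raise AttributeError(name)

class B(object):
    a = 23
    def f(self): return 'f in B'

class C(B):
    b = 67
    d = 123

x = C()
x.d = 77
print(instance_lookup(x, 'a'), instance_lookup(x, 'b'), instance_lookup(x, 'd'))
# prints: 23 67 77
print(instance_lookup(x, 'f')())          # prints: f in B
```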

When these lookup steps do not find an attribute, Python raises an AttributeError exception. However, for lookups of x.name, when C defines or inherits the special method __getattr__, Python calls C.__getattr__(x,'name') rather than raising the exception. It’s then up to __getattr__ to either return a suitable value or raise the appropriate exception, normally AttributeError.
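A brief sketch of __getattr__ as a fallback, with a hypothetical class Fallback that synthesizes any attribute whose name starts with dyn_:

```python
class Fallback(object):
    present = 'class attribute'

    def __getattr__(self, name):
        # Called only after the normal lookup steps have failed.
        if name.startswith('dyn_'):
            return name[4:].upper()
        raise AttributeError(name)

fb = Fallback()
print(fb.present)                         # prints: class attribute
print(fb.dyn_hello)                       # prints: HELLO
print(hasattr(fb, 'missing'))             # prints: False
```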

Consider the following attribute references, which use the classes B and C and the instance x defined previously:

print(x.e, x.d, x.c, x.b, x.a)           # prints: 88 77 89 67 23

x.e and x.d succeed in step 2 of the instance lookup process, since no descriptors are involved, and 'e' and 'd' are both keys in x.__dict__. Therefore, the lookups go no further, but rather return 88 and 77. The other three references must proceed to step 3 of the instance process and look in x.__class__ (i.e., C). x.c and x.b succeed in step 1 of the class lookup process, since 'c' and 'b' are both keys in C.__dict__. Therefore, the lookups go no further but rather return 89 and 67. x.a gets all the way to step 2 of the class process, looking in C.__bases__[0] (i.e., B). 'a' is a key in B.__dict__; therefore, x.a finally succeeds and returns 23.

Setting an attribute

Note that the attribute lookup steps happen as just described only when you refer to an attribute, not when you bind an attribute. When you bind (on either a class or an instance) an attribute whose name is not special (unless a __setattr__ method, or the __set__ method of an overriding descriptor, intercepts the binding of an instance attribute), you affect only the __dict__ entry for the attribute (in the class or instance, respectively). In other words, for attribute binding, there is no lookup procedure involved, except for the check for overriding descriptors.
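For example, binding an attribute on an instance never rebinds a class attribute of the same name. Shared is a hypothetical class for this sketch.

```python
class Shared(object):
    count = 0

s = Shared()
s.count = 5                    # creates s.__dict__['count']; Shared.count is untouched
print(Shared.count, s.count)              # prints: 0 5
print(s.__dict__)                         # prints: {'count': 5}

del s.count                    # removes only the instance entry
print(s.count)                            # prints: 0 (found in the class again)
```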

Bound and Unbound Methods

The method __get__ of a function object can return an unbound method object (in v2) or the function object itself (in v3), or a bound method object that wraps the function. The key difference between unbound and bound methods is that an unbound method (v2 only) is not associated with a particular instance, while a bound method is.

In the code in the previous section, attributes f, g, and h are functions; therefore, an attribute reference to any one of them returns a method object that wraps the respective function. Consider the following:

print(x.h, x.g, x.f, C.h, C.g, C.f)

This statement outputs three bound methods, represented by strings like:

<bound method C.h of <__main__.C object at 0x8156d5c>>

and then, in v2, three unbound ones, represented by strings like:

<unbound method C.h>

or, in v3, three function objects, represented by strings like:

<function C.h at 0x102cabae8>

Bound versus unbound methods

We get bound methods when the attribute reference is on instance x, and unbound methods (in v3, function objects) when the attribute reference is on class C.

Because a bound method is already associated with a specific instance, you call the method as follows:

x.h()                      # prints: method h in class C

The key thing to notice here is that you don’t pass the method’s first argument, self, by the usual argument-passing syntax. Rather, a bound method of instance x implicitly binds the self parameter to object x. Thus, the body of the method can access the instance’s attributes as attributes of self, even though we don’t pass an explicit argument to the method.

An unbound method (v2 only), however, is not associated with a specific instance, so you must specify an appropriate instance as the first argument when you call an unbound method. For example:

C.h(x)                     # prints: method h in class C

You call unbound methods far less frequently than bound methods. One important use for unbound methods is to access overridden methods, as discussed in “Inheritance”; even for that task, though, it’s usually better to use the super built-in covered in “Cooperative superclass method calling”. Another frequent use of unbound methods is in higher-order functions; for example, to sort a list of strings alphabetically but case-insensitively, los.sort(key=str.lower) is fine.
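A quick sketch of that sorting idiom, with a hypothetical list of strings named los:

```python
los = ['banana', 'Cherry', 'apple']
los.sort(key=str.lower)        # str.lower is looked up on the class, not an instance
print(los)                                # prints: ['apple', 'banana', 'Cherry']
```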

Unbound method details (v2 only)

As we’ve just discussed, when an attribute reference on a class refers to a function, a reference to that attribute, in v2, returns an unbound method object that wraps the function. An unbound method has three attributes in addition to those of the function object it wraps: im_class is the class object supplying the method, im_func is the wrapped function, and im_self is always None. These attributes are all read-only, meaning that trying to rebind or unbind any of them raises an exception.

You can call an unbound method just as you would call its im_func function, but the first argument in any call must be an instance of im_class or a descendant. In other words, a call to an unbound method must have at least one argument, which corresponds to the wrapped function’s first formal parameter (conventionally named self).

Bound method details

When an attribute reference on an instance, in the course of the lookup, finds a function object that’s an attribute in the instance’s class, the lookup calls the function’s __get__ method to get the attribute’s value. The call, in this case, creates and returns a bound method that wraps the function.

Note that when the attribute reference’s lookup finds a function object in x.__dict__, the attribute reference operation does not create a bound method: in such cases Python does not treat the function as a descriptor, and does not call the function’s __get__ method; rather, the function object itself is the attribute’s value. Similarly, Python creates no bound method for callables that are not ordinary functions, such as built-in (as opposed to Python-coded) functions, since such callables are not descriptors.

A bound method is similar to an unbound method in that it has three read-only attributes in addition to those of the function object it wraps. Like in an unbound method, im_class is the class object that supplies the method, and im_func is the wrapped function. However, in a bound method object, attribute im_self refers to x, the instance from which you got the method. These im_ names apply to v2; in v3 (and also in v2.6 and later), the wrapped function and the instance are available as __func__ and __self__, and v3 has no equivalent of im_class.

You use a bound method just like its im_func function, but calls to a bound method do not explicitly supply an argument corresponding to the first formal parameter (conventionally named self). When you call a bound method, the bound method passes im_self as the first argument to im_func before other arguments (if any) given at the point of call.
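The sketch below inspects a bound method under v3, where the wrapped function and the instance are exposed as __func__ and __self__ rather than under the im_func and im_self names used in the text. Greeter is a hypothetical class.

```python
class Greeter(object):
    def hello(self, name):
        return 'hello, %s' % name

g = Greeter()
bm = g.hello                   # a bound method object

print(bm.__self__ is g)                              # prints: True
print(bm.__func__ is Greeter.__dict__['hello'])      # prints: True
# Calling the bound method passes __self__ as the first argument to __func__.
print(bm('Ann') == bm.__func__(bm.__self__, 'Ann'))  # prints: True
```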

Let’s follow in excruciating low-level detail the conceptual steps involved in a method call with the normal syntax x.name(arg). In the following context:

def f(a, b): ...             # a function f with two arguments

class C(object):
    name = f
x = C()

x is an instance object of class C, name is an identifier that names a method of x’s (an attribute of C whose value is a function, in this case function f), and arg is any expression. Python first checks if 'name' is the attribute name in C of an overriding descriptor, but it isn’t—functions are descriptors, because their type defines method __get__, but not overriding ones, because their type does not define method __set__. Python next checks if 'name' is a key in x.__dict__, but it isn’t. So Python finds name in C (everything would work in just the same way if name was found, by inheritance, in one of C’s __bases__). Python notices that the attribute’s value, function object f, is a descriptor. Therefore, Python calls f.__get__(x, C), which creates a bound method object with im_func set to f, im_class set to C, and im_self set to x. Then Python calls this bound method object, with arg as the only argument. The bound method inserts im_self (i.e., x) as the first argument, and arg becomes the second one, in a call to the bound method’s im_func (i.e., function f). The overall effect is just like calling:

x.__class__.__dict__['name'](x, arg)
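The same steps can be spelled out by hand. This is only an illustrative sketch: f here returns its arguments so the three spellings can be compared, and the attribute names __func__ and __self__ are the v3 equivalents of im_func and im_self:

```python
def f(a, b):
    return (a, b)

class C(object):
    name = f

x = C()
arg = 'some argument'

# Step by step: fetch the function from the class, then bind it to x.
func = C.__dict__['name']          # the plain function object f
bound = func.__get__(x, C)         # what the lookup x.name does for us

# All three spellings call f with x as the first argument.
r1 = x.name(arg)
r2 = bound(arg)
r3 = x.__class__.__dict__['name'](x, arg)
print(r1 == r2 == r3 == (x, arg))  # prints: True
print(bound.__func__ is f, bound.__self__ is x)   # prints: True True
```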

When a bound method’s function body executes, it has no special namespace relationship to either its self object or any class. Variables referenced are local or global, just as for any other function, as covered in “Namespaces”. Variables do not implicitly indicate attributes in self, nor do they indicate attributes in any class object. When the method needs to refer to, bind, or unbind an attribute of its self object, it does so by standard attribute-reference syntax (e.g., self.name). The lack of implicit scoping may take some getting used to (simply because Python differs in this respect from many other object-oriented languages), but it results in clarity, simplicity, and the removal of potential ambiguities.

Bound method objects are first-class objects: you can use them wherever you can use a callable object. Since a bound method holds references to the function it wraps and to the self object on which it executes, it’s a powerful and flexible alternative to a closure (covered in “Nested functions and nested scopes”). An instance object whose class supplies the special method __call__ (covered in Table 4-1) offers another viable alternative. Each of these constructs lets you bundle some behavior (code) and some state (data) into a single callable object. Closures are simplest, but limited in their applicability. Here’s the closure from “Nested functions and nested scopes”:

def make_adder_as_closure(augend):
    def add(addend): return addend+augend
    return add

Bound methods and callable instances are richer and more flexible than closures. Here’s how to implement the same functionality with a bound method:

def make_adder_as_bound_method(augend):
    class Adder(object):
        def __init__(self, augend): self.augend = augend
        def add(self, addend): return addend+self.augend
    return Adder(augend).add

And here’s how to implement it with a callable instance (an instance whose class supplies the special method __call__):

def make_adder_as_callable_instance(augend):
    class Adder(object):
        def __init__(self, augend): self.augend = augend
        def __call__(self, addend): return addend+self.augend
    return Adder(augend)

From the viewpoint of the code that calls the functions, all of these factory functions are interchangeable, since all of them return callable objects that are polymorphic (i.e., usable in the same ways). In terms of implementation, the closure is simplest; the bound method and the callable instance use more flexible, general, and powerful mechanisms, but there is no need for that extra power in this simple example.

Inheritance

When you use an attribute reference C.name on a class object C, and 'name' is not a key in C.__dict__, the lookup implicitly proceeds on each class object that is in C.__bases__ in a specific order (which for historical reasons is known as the method resolution order, or MRO, but applies to all attributes, not just methods). C’s base classes may in turn have their own bases. The lookup checks direct and indirect ancestors, one by one, in MRO, stopping when 'name' is found.

Method resolution order

The lookup of an attribute name in a class essentially occurs by visiting ancestor classes in left-to-right, depth-first order. However, in the presence of multiple inheritance (which makes the inheritance graph a general Directed Acyclic Graph rather than specifically a tree), this simple approach might lead to some ancestor class being visited twice. In such cases, the resolution order leaves in the lookup sequence only the rightmost occurrence of any given class.

Each class and built-in type has a special read-only class attribute called __mro__, which is the tuple of types used for method resolution, in order. You can reference __mro__ only on classes, not on instances, and, since __mro__ is a read-only attribute, you cannot rebind or unbind it. For a detailed and highly technical explanation of all aspects of Python’s MRO, you may want to study an online essay by Michele Simionato, The Python 2.3 Method Resolution Order, and GvR’s history note at the Python History site.
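A small diamond-shaped hierarchy makes the resolution order concrete. The class names below are placeholders, and the sketch catches both AttributeError and TypeError on the rebinding attempt because the exception type raised for a read-only attribute has differed between Python versions:

```python
class A(object): pass
class B(A): pass
class C(A): pass
class D(B, C): pass

# A plain left-to-right, depth-first walk would visit
# D, B, A, object, C, A, object; keeping only the rightmost
# occurrence of each class gives the actual order.
print([k.__name__ for k in D.__mro__])
# prints: ['D', 'B', 'C', 'A', 'object']

d = D()
print(hasattr(d, '__mro__'))   # prints: False (available on classes only)

try:
    D.__mro__ = (D, object)
except (AttributeError, TypeError):
    print('__mro__ is read-only')
```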

Overriding attributes

As we’ve just seen, the search for an attribute proceeds along the MRO (typically, up the inheritance tree) and stops as soon as the attribute is found. Descendant classes are always examined before their ancestors, so that, when a subclass defines an attribute with the same name as one in a superclass, the search finds the definition in the subclass and stops there. This is known as the subclass overriding the definition in the superclass. Consider the following:

class B(object):
    a = 23
    b = 45
    def f(self): print('method f in class B')
    def g(self): print('method g in class B')
class C(B):
    b = 67
    c = 89
    d = 123
    def g(self): print('method g in class C')
    def h(self): print('method h in class C')

In this code, class C overrides attributes b and g of its superclass B. Note that, unlike in some other languages, in Python you may override data attributes just as easily as callable attributes (methods).

Delegating to superclass methods

When a subclass C overrides a method f of its superclass B, the body of C.f often wants to delegate some part of its operation to the superclass’s implementation of the method. This can sometimes be done using what, in v2, is an unbound method (in v3, it’s a function object, but works just the same way), as follows:

class Base(object):
    def greet(self, name): print('Welcome', name)
class Sub(Base):
    def greet(self, name):
        print('Well Met and', end=' ')
        Base.greet(self, name)
x = Sub()
x.greet('Alex')

The delegation to the superclass, in the body of Sub.greet, uses an unbound method (in v2; in v3, with this same syntax, it’s a function object, which works just the same way) obtained by attribute reference Base.greet on the superclass, and therefore passes all arguments normally, including self. Delegating to a superclass implementation is the most frequent use of unbound methods in v2.

One common use of delegation occurs with special method __init__. When Python creates an instance, the __init__ methods of base classes are not automatically called, as they are in some other object-oriented languages. Thus, it is up to a subclass to perform the proper initialization of superclasses, by using delegation, if necessary. For example:

class Base(object):
    def __init__(self):
        self.anattribute = 23
class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.anotherattribute = 45

If the __init__ method of class Derived didn’t explicitly call that of class Base, instances of Derived would miss that portion of their initialization, and thus such instances would lack attribute anattribute. This issue does not arise if a subclass does not define __init__, since in that case it inherits it from the superclass. So there is never any reason to code:

class Derived(Base):
    def __init__(self):
        Base.__init__(self)

Never code a method that just delegates to the superclass

Never define a semantically empty __init__ (i.e., one that just delegates to the superclass): rather, just inherit __init__ from the superclass. This advice applies to all methods, special or not, but, for some reason, the bad habit of coding such semantically empty methods occurs most often for __init__.

Cooperative superclass method calling

Calling the superclass’s version of a method with unbound method syntax is quite problematic in cases of multiple inheritance with diamond-shaped graphs. Consider the following definitions:

class A(object):
    def met(self):
        print('A.met')
class B(A):
    def met(self):
        print('B.met')
        A.met(self)
class C(A):
    def met(self):
        print('C.met')
        A.met(self)
class D(B,C):
    def met(self):
        print('D.met')
        B.met(self)
        C.met(self)

In this code, when we call D().met(), A.met ends up being called twice. How can we ensure that each ancestor’s implementation of the method is called once and only once? The solution is to use built-in type super. In v2, this requires calling super(aclass, obj), which returns a special superobject of object obj. When we look up an attribute in this superobject, the lookup begins after class aclass in obj’s MRO. In v2, we can therefore rewrite the previous code as:

class A(object):
    def met(self):
        print('A.met')
class B(A):
    def met(self):
        print('B.met')
        super(B,self).met()
class C(A):
    def met(self):
        print('C.met')
        super(C,self).met()
class D(B,C):
    def met(self):
        print('D.met')
        super(D,self).met()

In v3, while the v2 syntax is still OK (and so the preceding snippet runs fine), super’s semantics have been strengthened so you can replace each of the calls to it with just super(), without arguments.

Now, D().met() results in exactly one call to each class’s version of met. If you get into the habit of always coding superclass calls with super, your classes fit smoothly even in complicated inheritance structures. There are no ill effects if the inheritance structure instead turns out to be simple.
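For reference, here is a v3-only sketch of the same hierarchy using the argument-free form of super. The list named calls is an assumption added just to record the order in which the methods run:

```python
calls = []   # records the order of calls, just for this illustration

class A(object):
    def met(self):
        calls.append('A.met')
class B(A):
    def met(self):
        calls.append('B.met')
        super().met()
class C(A):
    def met(self):
        calls.append('C.met')
        super().met()
class D(B, C):
    def met(self):
        calls.append('D.met')
        super().met()

D().met()
print(calls)   # prints: ['D.met', 'B.met', 'C.met', 'A.met']
```

Note that super().met() in B reaches C.met, not A.met, when the instance is a D: the lookup follows the MRO of the instance's class, not B's own bases.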

The only situation in which you may prefer to use the rougher approach of calling superclass methods through the unbound-method syntax is when the various classes have different and incompatible signatures for the same method—an unpleasant situation in many respects, but, if you do have to deal with it, the unbound-method syntax may sometimes be the least of evils. Proper use of multiple inheritance is seriously hampered—but then, even the most fundamental properties of OOP, such as polymorphism between base and subclass instances, are seriously impaired when you give methods of the same name different and incompatible signatures in the superclass and subclass.

“Deleting” class attributes

Inheritance and overriding provide a simple and effective way to add or modify (override) class attributes (such as methods) noninvasively (i.e., without modifying the base class defining the attributes) by adding or overriding the attributes in subclasses. However, inheritance does not offer a way to delete (hide) base classes’ attributes noninvasively. If the subclass simply fails to define (override) an attribute, Python finds the base class’s definition. If you need to perform such deletion, possibilities include:

  • Override the method and raise an exception in the method’s body.

  • Eschew inheritance, hold the attributes elsewhere than in the subclass’s __dict__, and define __getattr__ for selective delegation.

  • Override __getattribute__ to similar effect.

The last of these techniques is shown in “__getattribute__”.
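The first technique in the list might look like the following sketch. The classes Base and ReadOnly and their methods are hypothetical, and raising AttributeError is just one reasonable choice of exception:

```python
class Base(object):
    def save(self):
        return 'saved'
    def load(self):
        return 'loaded'

class ReadOnly(Base):
    def save(self):
        # "Delete" the inherited method by overriding it with one that fails.
        raise AttributeError('save is not available on ReadOnly instances')

r = ReadOnly()
print(r.load())          # prints: loaded
try:
    r.save()
except AttributeError as e:
    print('error: %s' % e)
```

With this approach the attribute is still present (hasattr(r, 'save') is true); only calling it fails. The other two techniques hide the attribute at lookup time instead.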

The Built-in object Type

The built-in object type is the ancestor of all built-in types and new-style classes. The object type defines some special methods (documented in “Special Methods”) that implement the default semantics of objects:

__new__ __init__

You can create a direct instance of object by calling object() without any arguments. The call implicitly uses object.__new__ and object.__init__ to make and return an instance object without attributes (and without even a __dict__ in which to hold attributes). Such instance objects may be useful as “sentinels,” guaranteed to compare unequal to any other distinct object.

__delattr__ __getattribute__ __setattr__

By default, any object handles attribute references (as covered in “Attribute Reference Basics”) using these methods of object.

__hash__ __repr__ __str__

Any object can be passed to the functions hash and repr and to the type str.

A subclass of object may override any of these methods and/or add others.

Class-Level Methods

Python supplies two built-in nonoverriding descriptor types, which give a class two distinct kinds of “class-level methods”: static methods and class methods.

Static methods

A static method is a method that you can call on a class, or on any instance of the class, without the special behavior and constraints of ordinary methods, bound or unbound, with regard to the first parameter. A static method may have any signature; it may have no parameters, and the first parameter, if any, plays no special role. You can think of a static method as an ordinary function that you’re able to call normally, despite the fact that it happens to be bound to a class attribute.

While it is never necessary to define static methods (you can always choose to instead define a normal function, outside the class), some programmers consider them to be an elegant syntax alternative when a function’s purpose is tightly bound to some specific class.

To build a static method, call the built-in type staticmethod and bind its result to a class attribute. Like all binding of class attributes, this is normally done in the body of the class, but you may also choose to perform it elsewhere. The only argument to staticmethod is the function to call when Python calls the static method. The following example shows one way to define and call a static method:

class AClass(object):
    def astatic(): print('a static method')
    astatic = staticmethod(astatic)
an_instance = AClass()
AClass.astatic()                    # prints: a static method
an_instance.astatic()               # prints: a static method

This example uses the same name for the function passed to staticmethod and for the attribute bound to staticmethod’s result. This naming is not mandatory, but it’s a good idea, and we recommend you always use it. Python also offers a special, simplified syntax to support this style, covered in “Decorators”.

Class methods

A class method is a method you can call on a class or on any instance of the class. Python binds the method’s first parameter to the class on which you call the method, or the class of the instance on which you call the method; it does not bind it to the instance, as for normal bound methods. The first parameter of a class method is conventionally named cls.

While it is never necessary to define class methods (you can always choose to instead define a normal function, outside the class, that takes the class object as its first parameter), class methods are an elegant alternative to such functions (particularly since they can usefully be overridden in subclasses, when that is necessary).

To build a class method, call the built-in type classmethod and bind its result to a class attribute. Like all binding of class attributes, this is normally done in the body of the class, but you may choose to perform it elsewhere. The only argument to classmethod is the function to call when Python calls the class method. Here’s one way you can define and call a class method:

class ABase(object):
    def aclassmet(cls): print('a class method for', cls.__name__)
    aclassmet = classmethod(aclassmet)
class ADeriv(ABase): pass
b_instance = ABase()
d_instance = ADeriv()
ABase.aclassmet()               # prints: a class method for ABase
b_instance.aclassmet()          # prints: a class method for ABase
ADeriv.aclassmet()              # prints: a class method for ADeriv
d_instance.aclassmet()          # prints: a class method for ADeriv

This example uses the same name for the function passed to classmethod and for the attribute bound to classmethod’s result. This naming is not mandatory, but it’s a good idea, and we recommend that you always use it. Python offers a special, simplified syntax to support this style, covered in “Decorators”.
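One common use of class methods is as alternative constructors. The following is a minimal sketch with hypothetical names (Point, LabeledPoint, from_pair); because cls is the class on which the method is called, the subclass gets instances of its own type for free:

```python
class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def from_pair(cls, pair):
        # cls is whatever class the method was called on, so subclasses
        # get instances of their own type without overriding anything.
        return cls(pair[0], pair[1])
    from_pair = classmethod(from_pair)

class LabeledPoint(Point):
    label = 'unnamed'

p = Point.from_pair((1, 2))
q = LabeledPoint.from_pair((3, 4))
print(type(p).__name__)   # prints: Point
print(type(q).__name__)   # prints: LabeledPoint
```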

Properties

Python supplies a built-in overriding descriptor type, which you may use to give a class’s instances properties.

A property is an instance attribute with special functionality. You reference, bind, or unbind the attribute with the normal syntax (e.g., print(x.prop), x.prop=23, del x.prop). However, rather than following the usual semantics for attribute reference, binding, and unbinding, these accesses call on instance x the methods that you specify as arguments to the built-in type property. Here’s one way to define a read-only property:

class Rectangle(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height
    def get_area(self):
        return self.width * self.height
    area = property(get_area, doc='area of the rectangle')

Each instance r of class Rectangle has a synthetic read-only attribute r.area, computed on the fly in method r.get_area() by multiplying the sides. The docstring Rectangle.area.__doc__ is 'area of the rectangle'. Attribute r.area is read-only (attempts to rebind or unbind it fail) because we specify only a get method in the call to property, no set or del methods.

Properties perform tasks similar to those of special methods __getattr__, __setattr__, and __delattr__ (covered in “General-Purpose Special Methods”), but are faster and simpler. To build a property, call the built-in type property and bind its result to a class attribute. Like all binding of class attributes, this is normally done in the body of the class, but you may choose to do it elsewhere. Within the body of a class C, you can use the following syntax:

attrib = property(fget=None, fset=None, fdel=None, doc=None)

When x is an instance of C and you reference x.attrib, Python calls on x the method you passed as argument fget to the property constructor, without arguments. When you assign x.attrib = value, Python calls the method you passed as argument fset, with value as the only argument. When you execute del x.attrib, Python calls the method you passed as argument fdel, without arguments. Python uses the argument you passed as doc as the docstring of the attribute. All parameters to property are optional. When an argument is missing, the corresponding operation is forbidden (Python raises an exception when some code attempts that operation). For example, in the Rectangle example, we made property area read-only, because we passed an argument only for parameter fget, and not for parameters fset and fdel.
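To illustrate all three accessor arguments together, here is a sketch with a hypothetical Temperature class: celsius supports reference, binding, and unbinding, while fahrenheit gets only fget and is therefore read-only:

```python
class Temperature(object):
    def __init__(self, celsius):
        self._celsius = celsius
    def _get(self):
        return self._celsius
    def _set(self, value):
        if value < -273.15:
            raise ValueError('below absolute zero')
        self._celsius = value
    def _del(self):
        del self._celsius
    celsius = property(_get, _set, _del, 'temperature in degrees Celsius')
    # fget only: rebinding or unbinding fahrenheit raises AttributeError.
    fahrenheit = property(lambda self: self._celsius * 9 / 5 + 32)

t = Temperature(20)
t.celsius = 25                 # calls _set(t, 25)
print(t.celsius)               # prints: 25
try:
    t.fahrenheit = 100
except AttributeError:
    print('fahrenheit is read-only')
del t.celsius                  # calls _del(t)
print(hasattr(t, 'celsius'))   # prints: False
```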

A more elegant syntax to create properties in a class is to use property as a decorator (see “Decorators”):

class Rectangle(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height
    @property
    def area(self):
        '''area of the rectangle'''
        return self.width * self.height

To use this syntax, you must give the getter method the same name as you want the property to have; the method’s docstring becomes the docstring of the property. If you want to add a setter and/or a deleter as well, use decorators named (in this example) area.setter and area.deleter, and name the methods thus decorated the same as the property, too. For example:

import math

class Rectangle(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height
    @property
    def area(self):
        '''area of the rectangle'''
        return self.width * self.height
    @area.setter
    def area(self, value):
        scale = math.sqrt(value/self.area)
        self.width *= scale
        self.height *= scale

Why properties are important

The crucial importance of properties is that their existence makes it perfectly safe (and indeed advisable) for you to expose public data attributes as part of your class’s public interface. Should it ever become necessary, in future versions of your class or other classes that need to be polymorphic to it, to have some code executed when the attribute is referenced, rebound, or unbound, you know you will be able to change the plain attribute into a property and get the desired effect without any impact on any other code that uses your class (AKA “client code”). This lets you avoid goofy idioms, such as accessor and mutator methods, required by OO languages that lack properties or equivalent machinery. For example, client code can simply use natural idioms such as:

some_instance.widget_count += 1

rather than being forced into contorted nests of accessors and mutators such as:

some_instance.set_widget_count(some_instance.get_widget_count() + 1)

If you’re ever tempted to code methods whose natural names are something like get_this or set_that, wrap those methods into properties instead, for clarity.
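The migration path described above can be sketched as follows. InventoryV1 and InventoryV2 are hypothetical stand-ins for two successive versions of the same class; the client code at the bottom is identical for both:

```python
# Version 1: a plain public data attribute.
class InventoryV1(object):
    def __init__(self):
        self.widget_count = 0

# Version 2: same interface, but rebinding now runs validation code.
class InventoryV2(object):
    def __init__(self):
        self._widget_count = 0
    @property
    def widget_count(self):
        return self._widget_count
    @widget_count.setter
    def widget_count(self, value):
        if value < 0:
            raise ValueError('widget_count cannot be negative')
        self._widget_count = value

# The same client code works unchanged with either version.
for some_instance in InventoryV1(), InventoryV2():
    some_instance.widget_count += 1
    print(some_instance.widget_count)   # prints: 1 (for each version)
```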

Properties and inheritance

Inheritance of properties is just like for any other attribute. However, there’s a little trap for the unwary: the methods called upon to access a property are those defined in the class in which the property itself is defined, without intrinsic use of further overriding that may happen in subclasses. For example:

class B(object):
  def f(self): return 23
  g = property(f)
class C(B):
  def f(self): return 42
c = C()
print(c.g)                # prints: 23, not 42

Accessing property c.g calls B.f, not C.f as you might expect. The reason is quite simple: the property constructor receives (directly or via the decorator syntax) the function object f (and that happens at the time the class statement for B executes, so the function object in question is the one also known as B.f). The fact that the subclass C later redefines name f is therefore irrelevant, since the property performs no lookup for that name, but rather uses the function object it was passed at creation time. If you need to work around this issue, you can always do it by adding the extra level of lookup indirection yourself:

class B(object):
  def f(self): return 23
  def _f_getter(self): return self.f()
  g = property(_f_getter)
class C(B):
  def f(self): return 42
c = C()
print(c.g)                # prints: 42, as expected

Here, the function object held by the property is B._f_getter, which in turn does perform a lookup for name f (since it calls self.f()); therefore, the overriding of f has the expected effect. After all, as David Wheeler famously put it, “All problems in computer science can be solved by another level of indirection.”1

__slots__

Normally, each instance object x of any class C has a dictionary x.__dict__ that Python uses to let you bind arbitrary attributes on x. To save a little memory (at the cost of letting x have only a predefined set of attribute names), you can define in class C a class attribute named __slots__, a sequence (normally a tuple) of strings (normally identifiers). When class C has an attribute __slots__, a direct instance x of class C has no x.__dict__, and any attempt to bind on x any attribute whose name is not in C.__slots__ raises an exception.

Using __slots__ lets you reduce memory consumption for small instance objects that can do without the powerful and convenient ability to have arbitrarily named attributes. __slots__ is worth adding only to classes that can have so many instances that saving a few tens of bytes per instance is important—typically classes that could have millions, not mere thousands, of instances alive at the same time. Unlike most other class attributes, __slots__ works as we’ve just described only if an assignment in the class body binds it as a class attribute. Any later alteration, rebinding, or unbinding of __slots__ has no effect, nor does inheriting __slots__ from a base class. Here’s how to add __slots__ to the Rectangle class defined earlier to get smaller (though less flexible) instances:

class OptimizedRectangle(Rectangle):
    __slots__ = 'width', 'height'

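We do not need to define a slot for the area property. __slots__ does not constrain properties, only ordinary instance attributes, which would reside in the instance’s __dict__ if __slots__ wasn’t defined. Keep in mind that the memory saving is fully realized only when every ancestor class other than object also defines __slots__: a class that inherits from a base class without __slots__ still gives its instances a __dict__.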
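The effects of __slots__ can be checked directly. In this sketch, Point and LoosePoint are hypothetical classes; the second shows that a subclass that does not itself define __slots__ gives its instances a __dict__ again:

```python
class Point(object):
    __slots__ = 'x', 'y'
    def __init__(self, x, y):
        self.x, self.y = x, y

p = Point(1, 2)
print(hasattr(p, '__dict__'))    # prints: False
try:
    p.z = 3                      # 'z' is not in Point.__slots__
except AttributeError:
    print('cannot bind p.z')

class LoosePoint(Point):
    pass                         # no __slots__ here: instances get a __dict__

q = LoosePoint(1, 2)
q.z = 3                          # allowed
print(q.__dict__)                # prints: {'z': 3}
```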

__getattribute__

All references to instance attributes go through special method __getattribute__. This method comes from object, where it implements the details of object attribute reference semantics documented in “Attribute Reference Basics”. However, you may override __getattribute__ for special purposes, such as hiding inherited class attributes for a subclass’s instances. The following example shows one way to implement a list without append:

class listNoAppend(list):
    def __getattribute__(self, name):
        if name == 'append': raise AttributeError(name)
        return list.__getattribute__(self, name)

An instance x of class listNoAppend is almost indistinguishable from a built-in list object, except that performance is substantially worse, and any reference to x.append raises an exception.
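A brief usage sketch (repeating the class so the snippet is self-contained) shows what is and is not hidden:

```python
class listNoAppend(list):
    def __getattribute__(self, name):
        if name == 'append': raise AttributeError(name)
        return list.__getattribute__(self, name)

x = listNoAppend([1, 2])
x.extend([3])                    # other list methods still work
print(x)                         # prints: [1, 2, 3]
print(hasattr(x, 'append'))      # prints: False

# The class attribute itself is untouched; only instance lookups are blocked.
list.append(x, 4)
print(x)                         # prints: [1, 2, 3, 4]
```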

Per-Instance Methods

An instance can have instance-specific bindings for all attributes, including callable attributes (methods). For a method, just like for any other attribute (except those bound to overriding descriptors), an instance-specific binding hides a class-level binding: attribute lookup does not consider the class when it finds a binding directly in the instance. An instance-specific binding for a callable attribute does not perform any of the transformations detailed in “Bound and Unbound Methods”: the attribute reference returns exactly the same callable object that was earlier bound directly to the instance attribute.

However, this does not work as you might expect for per-instance bindings of the special methods that Python calls implicitly as a result of various operations, as covered in “Special Methods”. Such implicit uses of special methods always rely on the class-level binding of the special method, if any. For example:

def fake_get_item(idx): return idx
class MyClass(object): pass
n = MyClass()
n.__getitem__ = fake_get_item
print(n[23])                      # results in (exact message varies by version):
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: 'MyClass' object is not subscriptable
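By contrast, binding the special method on the class does make the implicit operation work. This sketch reuses the MyClass name from the example above, with lambdas standing in for real methods:

```python
class MyClass(object):
    pass

n = MyClass()
n.__getitem__ = lambda idx: idx      # per-instance binding: ignored by n[...]
try:
    n[23]
except TypeError:
    print('indexing n fails')

# Binding the special method on the class makes indexing work.
MyClass.__getitem__ = lambda self, idx: idx
print(n[23])                          # prints: 23

# An explicit call still finds the per-instance binding first.
print(n.__getitem__(99))              # prints: 99
```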

Inheritance from Built-in Types

A class can inherit from a built-in type. However, a class may directly or indirectly extend multiple built-in types only if those types are specifically designed to allow this level of mutual compatibility. Python does not support unconstrained inheritance from multiple arbitrary built-in types. Normally, a new-style class only extends at most one substantial built-in type—this means at most one built-in type in addition to object, which is the superclass of all built-in types and classes and imposes no constraints on multiple inheritance. For example:

class noway(dict, list): pass

raises a TypeError exception, with a detailed explanation of “multiple bases have instance lay-out conflict.” If you ever see such error messages, it means that you’re trying to inherit, directly or indirectly, from multiple built-in types that are not specifically designed to cooperate at such a deep level.

Special Methods

A class may define or inherit special methods (i.e., methods whose names begin and end with double underscores, AKA “dunder” or “magic” methods). Each special method relates to a specific operation. Python implicitly calls a special method whenever you perform the related operation on an instance object. In most cases, the method’s return value is the operation’s result, and attempting an operation when its related method is not present raises an exception.

Throughout this section, we point out the cases in which these general rules do not apply. In the following, x is the instance of class C on which you perform the operation, and y is the other operand, if any. The parameter self of each method also refers to the instance object x. In the following sections, whenever we mention calls to x.__whatever__(...), keep in mind that the exact call happening is rather, pedantically speaking, x.__class__.__whatever__(x, ...).

General-Purpose Special Methods

Some special methods relate to general-purpose operations. A class that defines or inherits these methods allows its instances to control such operations. These operations can be divided into the following categories:

Initialization and finalization

A class can control its instances’ initialization (a very common requirement) via the special methods __new__ and __init__, and/or their finalization (a rare requirement) via __del__.

Representation as string

A class can control how Python renders its instances as strings via special methods __repr__, __str__, __format__, (v3 only) __bytes__, and (v2 only) __unicode__.

Comparison, hashing, and use in a Boolean context

A class can control how its instances compare with other objects (methods __lt__, __le__, __gt__, __ge__, __eq__, __ne__), how dictionaries use them as keys and sets use them as members (__hash__), and whether they evaluate to true or false in Boolean contexts (__nonzero__ in v2, __bool__ in v3).

Attribute reference, binding, and unbinding

A class can control access to its instances’ attributes (reference, binding, unbinding) via special methods __getattribute__, __getattr__, __setattr__, and __delattr__.

Callable instances

An instance is callable, just like a function object, if its class has the special method __call__.

Table 4-1 documents the general-purpose special methods.

Table 4-1. General-purpose special methods

__bytes__

__bytes__(self)

In v3, calling bytes(x) calls x.__bytes__(), if present. If a class supplies both special methods __bytes__ and __str__, the two should return equivalent strings (of bytes and text type, respectively).

__call__

__call__(self[,args...])

When you call x([args...]), Python translates the operation into a call to x.__call__([args...]). The arguments for the call operation correspond to the parameters for the __call__ method, minus the first. The first parameter, conventionally called self, refers to x, and Python supplies it implicitly and automatically, just as in any other call to a bound method.

__dir__

__dir__(self)

When you call dir(x), Python translates the operation into a call to x.__dir__(), which must return a sorted list of x’s attributes. If x’s class does not have a __dir__, then dir(x) does its own introspection to return a list of x’s attributes, striving to produce relevant, rather than complete, information.

__del__

__del__(self)

Just before x disappears because of garbage collection, Python calls x.__del__() to let x finalize itself. If __del__ is absent, Python performs no special finalization upon garbage-collecting x (this is the usual case: very few classes need to define __del__). Python ignores the return value of __del__ and performs no implicit call to __del__ methods of class C’s superclasses. C.__del__ must explicitly perform any needed finalization, including, if need be, by delegation. For example, when class C has a base class B to finalize, the code in C.__del__ must call super(C, self).__del__() (or, in v3, just super().__del__()).

Note that the __del__ method has no direct connection with the del statement, as covered in “del Statements”.

__del__ is generally not the best approach when you need timely and guaranteed finalization. For such needs, use the try/finally statement covered in “try/finally” (or, even better, the with statement, covered in “The with Statement”). In v2, and in v3 releases before 3.4, instances of classes defining __del__ cannot participate in cyclic-garbage collection, covered in “Garbage Collection”; from v3.4 onward, the collector is able to finalize and reclaim such cycles. In any case, you should be particularly careful to avoid reference loops involving such instances, and define __del__ only when there is no feasible alternative.

__delattr__

__delattr__(self, name)

At every request to unbind attribute x.y (typically, a del statement del x.y), Python calls x.__delattr__('y'). All the considerations discussed later for __setattr__ also apply to __delattr__. Python ignores the return value of __delattr__. If __delattr__ is absent, Python translates del x.y into del x.__dict__['y'].

__eq__, __ge__, __gt__, __le__, __lt__, __ne__

__eq__(self, other) __ge__(self, other) __gt__(self, other) __le__(self, other) __lt__(self, other) __ne__(self, other)

The comparisons x==y, x>=y, x>y, x<=y, x<y, and x!=y, respectively, call the special methods listed here, which should return False or True. Each method may return NotImplemented to tell Python to handle the comparison in alternative ways (e.g., Python may then try y>x in lieu of x<y).

Best practice is to define only one inequality comparison method (normally __lt__) plus __eq__, and decorate the class with functools.total_ordering (covered in Table 7-4) to avoid boilerplate, and any risk of logical contradictions in your comparisons.
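A minimal sketch of that practice follows. The class Version and its key attribute are hypothetical; the only library call assumed is functools.total_ordering itself.

```python
import functools

@functools.total_ordering
class Version(object):
    """Hypothetical class ordered by its (major, minor) tuple."""
    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented     # let Python try alternatives
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key

    def __hash__(self):
        # Defining __eq__ alone would leave v3 instances unhashable
        return hash(self.key)
```

With only __eq__ and __lt__ written by hand, expressions such as Version(1, 3) >= Version(1, 2) work too, since the decorator supplies the remaining comparison methods.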

__getattr__

__getattr__(self, name)

When the attribute x.y can’t be found by the usual steps (i.e., when AttributeError would normally be raised), Python calls x.__getattr__('y') instead. Python does not call __getattr__ for attributes found by normal means (i.e., as keys in x.__dict__, or via x.__class__). If you want Python to call __getattr__ on every attribute reference, keep the attributes elsewhere (e.g., in another dictionary referenced by an attribute with a private name), or else override __getattribute__ instead. __getattr__ should raise AttributeError if it cannot find y.
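For instance, a class might fall back to a dictionary of defaults only when normal lookup fails. This is a sketch under stated assumptions: the class Defaults and its private attribute _defaults are hypothetical names.

```python
class Defaults(object):
    """Hypothetical class: missing attributes fall back to stored defaults."""
    def __init__(self, **defaults):
        self._defaults = defaults

    def __getattr__(self, name):
        # Reached only when the usual lookup steps did not find `name`
        try:
            return self._defaults[name]
        except KeyError:
            raise AttributeError(name)

d = Defaults(color='red')
d.size = 3          # a normal instance attribute: __getattr__ is not involved
```

Here d.size is found in d.__dict__, d.color comes from __getattr__, and any other name raises AttributeError.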

__getattribute__

__getattribute__(self, name)

At every request to access attribute x.y, Python calls x.__getattribute__('y'), which must get and return the attribute value or else raise AttributeError. The normal semantics of attribute access (using x.__dict__, C.__slots__, C’s class attributes, x.__getattr__) are all due to object.__getattribute__.

When class C overrides __getattribute__, it must implement all of the attribute access semantics it wants to offer. Most often, the most convenient way to implement attribute access semantics is by delegating (e.g., calling object.__getattribute__(self, ...) as part of the operation of your override of __getattribute__).

Overriding __getattribute__ slows attribute access

When a class overrides __getattribute__, attribute accesses on instances of the class become slow, since the overriding code executes on every such attribute access.
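The delegation approach can be sketched as follows. The class Logged and its class-level list accesses are hypothetical; the override records each requested name and then hands the real work to object.__getattribute__.

```python
class Logged(object):
    """Hypothetical class that records every attribute name accessed."""
    accesses = []        # class-level log, shared by all instances

    def __getattribute__(self, name):
        Logged.accesses.append(name)
        # Delegate to object for the normal attribute-access semantics
        return object.__getattribute__(self, name)

x = Logged()
x.value = 23             # binding goes through __setattr__, not this method
v = x.value              # this access is logged, then resolved normally
```

Because the override runs on every access, even this small amount of bookkeeping adds overhead to each attribute reference on instances of Logged.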

__hash__

__hash__(self)

Calling hash(x) calls x.__hash__() (and so do other contexts that need to know x’s hash value, namely, using x as a dictionary key, such as D[x] where D is a dictionary, or using x as a set member). __hash__ must return an int such that x==y implies hash(x)==hash(y), and must always return the same value for a given object.

When __hash__ is absent, calling hash(x) calls id(x) instead, as long as __eq__ is also absent. Other contexts that need to know x’s hash value behave the same way.

Any x such that hash(x) returns a result, rather than raising an exception, is known as a hashable object. When __hash__ is absent, but __eq__ is present, calling hash(x) raises an exception (and so do other contexts that need to know x’s hash value). In this case, x is not hashable and therefore cannot be a dictionary key or set member.

You normally define __hash__ only for immutable objects that also define __eq__. Note that if there exists any y such that x==y, even if y is of a different type, and both x and y are hashable, you must ensure that hash(x)==hash(y).
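One common way to keep __eq__ and __hash__ consistent is to compute both from the same tuple of fields. The following is a minimal sketch: the class Point is hypothetical, and it is immutable only by convention (its fields have private names and no setters).

```python
class Point(object):
    """Hypothetical 2D point, immutable by convention."""
    def __init__(self, x, y):
        self._x, self._y = x, y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __ne__(self, other):   # needed in v2; v3 derives it from __eq__
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares
        return hash((self._x, self._y))
```

Equal points then hash equally, so they collapse to one set member and are interchangeable as dictionary keys.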

__init__

__init__(self[,args...])

When a call C([args...]) creates instance x of class C, Python calls x.__init__([args...]) to let x initialize itself. If __init__ is absent (i.e., it’s inherited from object), you must call class C without arguments, C(), and x has no instance-specific attributes upon creation. Python performs no implicit call to __init__ methods of class C’s superclasses. C.__init__ must explicitly perform any needed initialization, including, if need be, by delegation. For example, when class C has a base class B to initialize without arguments, the code in C.__init__ must explicitly call super(C, self).__init__() (or, in v3, just super().__init__()). However, __init__’s inheritance works just like for any other method or attribute: that is, if class C itself does not override __init__, it just inherits it from the first superclass in its __mro__ to override __init__, like for every other attribute.

__init__ must return None; otherwise, calling the class raises a TypeError.

__new__

__new__(cls[,args...])

When you call C([args...]), Python gets the new instance x that you are creating by invoking C.__new__(C,[args...]). Every class has the class method __new__ (most often simply inheriting it from object), which can return any value x. In other words, __new__ is not constrained to return a new instance of C, although normally it’s expected to do so. If, and only if, the value x that __new__ returns is indeed an instance of C or of any subclass of C (whether a new or previously existing one), Python continues after calling __new__ by implicitly calling __init__ on x (with the same [args...] that were originally passed to __new__).

Initialize immutables in __new__, all others in __init__

Since you could perform most kinds of initialization of new instances in either __init__ or __new__, you may wonder where best to place them. Simple: put the initialization in __init__ only, unless you have a specific reason to put it in __new__. (If a type is immutable, its instances cannot be changed in __init__ for initialization purposes, so this is a special case in which __new__ does have to perform all initialization.) This tip makes life simpler, since __init__ is an instance method, while __new__ is a specialized class method.
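The special case of immutable types can be sketched with a subclass of float. The class Celsius is hypothetical; the point is only that the value must be fixed in __new__, since by the time __init__ would run the float already exists and cannot change.

```python
class Celsius(float):
    """Hypothetical float subclass rejecting values below absolute zero."""
    def __new__(cls, degrees):
        if degrees < -273.15:
            raise ValueError('below absolute zero')
        # float is immutable: the value is set here, not in __init__
        return super(Celsius, cls).__new__(cls, degrees)

t = Celsius(21.5)
```

t is an instance of Celsius that behaves as the float 21.5, while Celsius(-300.0) raises ValueError.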

__nonzero__

__nonzero__(self)

When evaluating x as true or false (see “Boolean Values”)—for example, on a call to bool(x)—v2 calls x.__nonzero__(), which should return True or False. When __nonzero__ is not present, Python calls __len__ instead, and takes x as false when x.__len__() returns 0 (so, to check if a container is nonempty, avoid coding if len(container)>0:; just code if container: instead). When neither __nonzero__ nor __len__ is present, Python always considers x true.

In v3, this special method is spelled, more readably, as __bool__.

__repr__

__repr__(self)

Calling repr(x) (which also happens implicitly in the interactive interpreter when x is the result of an expression statement) calls x.__repr__() to get and return a complete string representation of x. If __repr__ is absent, Python uses a default string representation. __repr__ should return a string with unambiguous information on x. Ideally, when feasible, the string should be an expression such that eval(repr(x))==x (but don’t go crazy aiming for that goal).

__setattr__

__setattr__(self, name, value)

At every request to bind attribute x.y (typically, an assignment statement x.y=value, but also, for example, setattr(x, 'y', value)), Python calls x.__setattr__('y', value). Python always calls __setattr__ for any attribute binding on x—a major difference from __getattr__ (__setattr__ is closer to __getattribute__ in this sense). To avoid recursion, when x.__setattr__ binds x’s attributes, it must modify x.__dict__ directly (e.g., via x.__dict__[name]=value); even better, __setattr__ can delegate the setting to the superclass (by calling super(C, x).__setattr__('y', value) or, in v3, just super().__setattr__('y', value)). Python ignores the return value of __setattr__. If __setattr__ is absent (i.e., inherited from object), and C.y is not an overriding descriptor, Python usually translates x.y=z into x.__dict__['y']=z.
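As an illustration of delegating to the superclass, here is a sketch of a class whose attributes can each be bound only once. The class name Frozen and the error message are hypothetical.

```python
class Frozen(object):
    """Hypothetical class: each attribute may be bound only once."""
    def __setattr__(self, name, value):
        if name in self.__dict__:
            raise AttributeError('cannot rebind {!r}'.format(name))
        # Delegate the actual binding to the superclass, avoiding recursion
        super(Frozen, self).__setattr__(name, value)

f = Frozen()
f.a = 1        # first binding succeeds
```

A second assignment such as f.a = 2 raises AttributeError, and so does setattr(f, 'a', 2), since every binding goes through __setattr__.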

__str__

__str__(self)

The str(x) built-in type and the print(x) function call x.__str__() to get an informal, concise string representation of x. If __str__ is absent, Python calls x.__repr__ instead. __str__ should return a conveniently human-readable string, even if it entails some approximation.

__unicode__

__unicode__(self)

In v2, calling unicode(x) calls x.__unicode__(), if present, in preference to x.__str__(). If a class supplies both special methods __unicode__ and __str__, the two should return equivalent strings (of Unicode and plain-string type, respectively).

__format__

__format__(self, format_string='')

Calling format(x) calls x.__format__(), and calling format(x, format_string) calls x.__format__(format_string). The class is responsible for interpreting the format string (each class may define its own small “language” of format specifications, inspired by those implemented by built-in types as covered in “String Formatting”). If __format__ is inherited from object, it delegates to __str__ and does not accept a nonempty format string.
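A class-specific format “language” can be very small. In this sketch, the class Angle and its two format codes ('d' for degrees, 'r' for radians) are invented for illustration and are not part of any standard.

```python
import math

class Angle(object):
    """Hypothetical class with a tiny format language: 'd' or 'r'."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        if format_spec in ('', 'd'):
            return '{:g} deg'.format(self.degrees)
        if format_spec == 'r':
            return '{:.4f} rad'.format(math.radians(self.degrees))
        raise ValueError('unknown format spec {!r}'.format(format_spec))

plain = format(Angle(180))                # empty format string
radians = format(Angle(180), 'r')
embedded = 'turn {:d}'.format(Angle(90))  # str.format also calls __format__
```

The three results are '180 deg', '3.1416 rad', and 'turn 90 deg', respectively.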

Special Methods for Containers

An instance can be a container (a sequence, mapping, or set—mutually exclusive concepts2). For maximum usefulness, containers should provide special methods __getitem__, __setitem__, __delitem__, __len__, __contains__, and __iter__, plus nonspecial methods discussed in the following sections. In many cases, suitable implementations of the nonspecial methods can be had by extending the appropriate abstract base class, from module collections, such as Sequence, MutableSequence, and so on, as covered in “Abstract Base Classes”.

Sequences

In each item-access special method, a sequence that has L items should accept any integer key such that  -L<=key<L.3 For compatibility with built-in sequences, a negative index key, 0>key>=-L, should be equivalent to key+L. When key has an invalid type, indexing should raise TypeError. When key is a value of a valid type but out of range, indexing should raise IndexError. For sequence classes that do not define __iter__, the for statement relies on these requirements, as do built-in functions that take iterable arguments. Every item-access special method of a sequence should also, if at all practical, accept as its index argument an instance of the built-in type slice whose start, step, and stop attributes are ints or None; the slicing syntax relies on this requirement, as covered in “Container slicing”.

A sequence should also allow concatenation (with another sequence of the same type) by +, and repetition by * (multiplication by an integer). A sequence should therefore have special methods __add__, __mul__, __radd__, and __rmul__, covered in “Special Methods for Numeric Objects”; mutable sequences should also have equivalent in-place methods __iadd__ and __imul__. A sequence should be meaningfully comparable to another sequence of the same type, implementing lexicographic comparison like lists and tuples do. (Inheriting from ABCs Sequence or MutableSequence, alas, does not suffice to fulfill these requirements; such inheritance only supplies __iadd__.)

Every sequence should have the nonspecial methods covered in “List methods”: count and index in any case, and, if mutable, then also append, insert, extend, pop, remove, reverse, and sort, with the same signatures and semantics as the corresponding methods of lists. (Inheriting from ABCs Sequence or MutableSequence does suffice to fulfill these requirements, except for sort.)

An immutable sequence should be hashable if, and only if, all of its items are. A sequence type may constrain its items in some ways (for example, accepting only string items), but that is not mandatory.

Mappings

A mapping’s item-access special methods should raise KeyError, rather than IndexError, when they receive an invalid key argument value of a valid type. Any mapping should define the nonspecial methods covered in “Dictionary Methods”: copy, get, items, keys, values, and, in v2, iteritems, iterkeys, and itervalues. In v2, special method __iter__ should be equivalent to iterkeys (in v3, it should be equivalent to keys, which, in v3, has the semantics iterkeys has in v2). A mutable mapping should also define methods clear, pop, popitem, setdefault, and update. (Inheriting from ABCs Mapping or MutableMapping does fulfill these requirements, except for copy.)

An immutable mapping should be hashable if all of its items are. A mapping type may constrain its keys in some ways (for example, accepting only hashable keys, or, even more specifically, accepting, say, only string keys), but that is not mandatory. Any mapping should be meaningfully comparable to another mapping of the same type (at least for equality and inequality; not necessarily for ordering comparisons).

Sets

Sets are a peculiar kind of container—containers that are neither sequences nor mappings, and cannot be indexed, but do have a length (number of elements) and are iterable. Sets also support many operators (&, |, ^, -, as well as membership tests and comparisons) and equivalent nonspecial methods (intersection, union, and so on). If you implement a set-like container, it should be polymorphic to Python built-in sets, covered in “Sets”. (Inheriting from ABCs Set or MutableSet does fulfill these requirements.)

An immutable set-like type should be hashable if all of its elements are. A set-like type may constrain its elements in some ways (for example, accepting only hashable elements, or, even more specifically, accepting, say, only integer elements), but that is not mandatory.

Container slicing

When you reference, bind, or unbind a slicing such as x[i:j] or x[i:j:k] on a container x (in practice, this is only used with sequences), Python calls x’s applicable item-access special method, passing as key an object of a built-in type called a slice object. A slice object has the attributes start, stop, and step. Each attribute is None if you omit the corresponding value in the slice syntax. For example, del x[:3] calls x.__delitem__(y), where y is a slice object such that y.stop is 3, y.start is None, and y.step is None. It is up to container object x to appropriately interpret slice object arguments passed to x’s special methods. The method indices of slice objects can help: call it with your container’s length as its only argument, and it returns a tuple of three integers suitable as start, stop, and step for a loop indexing each item in the slice (that is, as the arguments to range). A common idiom in a sequence class’s __getitem__ special method, to fully support slicing, is, for example:

import numbers

def __getitem__(self, index):
  # Recursively special-case slicing
  if isinstance(index, slice):
    return self.__class__(self[x]
                          for x in range(*index.indices(len(self))))
  # Check index, dealing with negative indices too
  if not isinstance(index, numbers.Integral): raise TypeError
  if index < 0: index += len(self)
  if not (0 <= index < len(self)): raise IndexError
  # Index is now a correct integral number, within range(len(self))
  ...rest of __getitem__, dealing with single-item access...

This idiom uses generator-expression (genexp) syntax and assumes that your class’s __init__ method can be called with an iterable argument to create a suitable new instance of the class.
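A complete, runnable variant of this idiom might look as follows. The class TupleSeq and its private attribute _items are hypothetical; the sketch assumes, as the idiom requires, that the class can be instantiated from any iterable.

```python
import numbers

class TupleSeq(object):
    """Hypothetical immutable sequence wrapping a tuple of items."""
    def __init__(self, iterable=()):
        self._items = tuple(iterable)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # indices() fills in None values and clips to the length
            return self.__class__(
                self[i] for i in range(*index.indices(len(self))))
        if not isinstance(index, numbers.Integral):
            raise TypeError('index must be an integer or a slice')
        if index < 0:
            index += len(self)
        if not (0 <= index < len(self)):
            raise IndexError('index out of range')
        return self._items[index]

s = TupleSeq('abcdef')
middle = s[1:4]       # a new TupleSeq holding 'b', 'c', 'd'
backward = s[::-2]    # a new TupleSeq holding 'f', 'd', 'b'
```

Since TupleSeq defines no __iter__, looping over an instance relies on __getitem__ raising IndexError at the end, as described in “Sequences”.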

Container methods

The special methods __getitem__, __setitem__, __delitem__, __iter__, __len__, and __contains__ expose container functionality (see Table 4-2).

Table 4-2. Container methods

__contains__

__contains__(self,item)

The Boolean test y in x calls x.__contains__(y). When x is a sequence, or set-like, __contains__ should return True when y equals the value of an item in x. When x is a mapping, __contains__ should return True when y equals the value of a key in x. Otherwise, __contains__ should return False. When __contains__ is absent, Python performs y in x as follows, taking time proportional to len(x):

for z in x:
    if y==z: return True
return False

__delitem__

__delitem__(self,key)

For a request to unbind an item or slice of x (typically del x[key]), Python calls x.__delitem__(key). A container x should have __delitem__ only if x is mutable so that items (and possibly slices) can be removed.

__getitem__

__getitem__(self,key)

When you access x[key] (i.e., when you index or slice container x), Python calls x.__getitem__(key). All (non-set-like) containers should have __getitem__.

__iter__

__iter__(self)

For a request to loop on all items of x (typically for item in x), Python calls x.__iter__() to get an iterator on x. The built-in function iter(x) also calls x.__iter__(). When __iter__ is absent, iter(x) synthesizes and returns an iterator object that wraps x and yields x[0], x[1], and so on, until one of these indexings raises IndexError to indicate the end of the container. However, it is best to ensure that all of the container classes you code have __iter__.

__len__

__len__(self)

Calling len(x) calls x.__len__() (and so do other built-in functions that need to know how many items are in container x). __len__ should return an int, the number of items in x. Python also calls x.__len__() to evaluate x in a Boolean context, when __nonzero__ (__bool__ in v3) is absent; in this case, a container is taken as false if and only if the container is empty (i.e., the container’s length is 0). All containers should have __len__, unless it’s just too expensive for the container to determine how many items it contains.

__setitem__

__setitem__(self,key,value)

For a request to bind an item or slice of x (typically an assignment x[key]=value), Python calls x.__setitem__(key,value). A container x should have __setitem__ only if x is mutable so that items, and possibly slices, can be added and/or rebound.
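Several of the methods in Table 4-2 can be seen working together in a small read-only container. This is only a sketch: the class Countdown is hypothetical, and it is neither a full sequence nor a full set.

```python
class Countdown(object):
    """Hypothetical read-only container holding n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __iter__(self):
        # A generator function is a compact way to return an iterator
        for i in range(self.n, 0, -1):
            yield i

    def __contains__(self, item):
        # Constant-time test, instead of the default linear scan
        return isinstance(item, int) and 1 <= item <= self.n

items = list(Countdown(3))     # [3, 2, 1], via __iter__
```

Since the class defines __len__ but neither __nonzero__ nor __bool__, Countdown(0) is false in a Boolean context, while any Countdown(n) with n>0 is true.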

Abstract Base Classes

Abstract base classes (ABCs) are an important pattern in object-oriented (OO) design: they’re classes that cannot be directly instantiated, but exist only to be extended by concrete classes (the more usual kind of classes, the ones that can be instantiated).

One recommended approach to OO design is to never extend a concrete class: if two concrete classes have so much in common that you’re tempted to have one of them inherit from the other, proceed instead by making an abstract base class that subsumes all they do have in common, and have each concrete class extend that ABC. This approach avoids many of the subtle traps and pitfalls of inheritance.

Python offers rich support for ABCs, enough to make them a first-class part of Python’s object model.

abc

The standard library module abc supplies metaclass ABCMeta and, in v3, class ABC (subclassing ABC makes ABCMeta the metaclass, and has no other effect).

When you use abc.ABCMeta as the metaclass for any class C, this makes C an ABC, and supplies the class method C.register, callable with a single argument: that argument can be any existing class (or built-in type) X.

Calling C.register(X) makes X a virtual subclass of C, meaning that issubclass(X,C) returns True, but C does not appear in X.__mro__, nor does X inherit any of C’s methods or other attributes.

Of course, it’s also possible to have a new class Y inherit from C in the normal way, in which case C does appear in Y.__mro__, and Y inherits all of C’s methods, as usual in subclassing.

An ABC C can also optionally override class method __subclasshook__, which issubclass(X,C) calls with the single argument X, X being any class or type. When C.__subclasshook__(X) returns True, then so does issubclass(X,C); when C.__subclasshook__(X) returns False, then so does issubclass(X,C); when C.__subclasshook__(X) returns NotImplemented, then issubclass(X,C) proceeds in the usual way.

The module abc also supplies the decorator abstractmethod (and abstractproperty, but the latter is deprecated in v3, where you can just apply both the abstractmethod and property decorators to get the same effect). Abstract methods and properties can have implementations (available to subclasses via the super built-in)—however, the point of making methods and properties abstract is that you can instantiate any nonvirtual subclass X of an ABC C only if X overrides every abstract property and method of C.
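The difference between ordinary and virtual subclasses can be sketched as follows, in v3 syntax. The classes Shape, Square, and Blob are hypothetical.

```python
import abc

class Shape(abc.ABC):                 # hypothetical ABC (v3 spelling)
    @abc.abstractmethod
    def area(self):
        """Return the shape's area."""

    def describe(self):               # concrete method built on area
        return 'shape with area {}'.format(self.area())

class Square(Shape):                  # nonvirtual subclass: overrides area
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

class Blob(object):                   # unrelated class
    pass

Shape.register(Blob)                  # Blob becomes a virtual subclass
```

Square(3).describe() works because Square inherits describe and overrides area; Shape() raises TypeError because area is still abstract; and issubclass(Blob, Shape) is true even though Blob inherits nothing from Shape.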

ABCs in the collections module

collections supplies many ABCs. Since Python 3.3, the ABCs are in collections.abc (but, for backward compatibility, can still be accessed directly in collections itself: the latter access will cease working in some future release of v3).

Some just characterize any class defining or inheriting a specific abstract method, as listed in Table 4-3:

Table 4-3.  
Callable Any class with __call__
Container Any class with __contains__
Hashable Any class with __hash__
Iterable Any class with __iter__
Sized Any class with __len__

The other ABCs in collections extend one or more of the preceding ones, add more abstract methods, and supply mixin methods implemented in terms of the abstract methods (when you extend any ABC in a concrete class, you must override the abstract methods; you can optionally override some or all of the mixin methods, if that helps improve performance, but you don’t have to—you can just inherit them, if this results in performance that’s sufficient for your purposes).

Here is the set of ABCs directly extending the preceding ones:

ABC | Extends | Abstract methods | Mixin methods
Iterator | Iterable | __next__ (in v2, next) | __iter__
Mapping | Container, Iterable, Sized | __getitem__, __iter__, __len__ | __contains__, __eq__, __ne__, get, items, keys, values
MappingView | Sized | | __len__
Sequence | Container, Iterable, Sized | __getitem__, __len__ | __contains__, __iter__, __reversed__, count, index
Set | Container, Iterable, Sized | __contains__, __iter__, __len__ | __and__, __eq__, __ge__, __gt__, __le__, __lt__, __ne__, __or__, __sub__, __xor__, isdisjoint

And lastly, the set of ABCs further extending the previous ones:

ABC | Extends | Abstract methods | Mixin methods
ItemsView | MappingView, Set | | __contains__, __iter__
KeysView | MappingView, Set | | __contains__, __iter__
MutableMapping | Mapping | __delitem__, __getitem__, __iter__, __len__, __setitem__ | Mapping’s methods, plus clear, pop, popitem, setdefault, update
MutableSequence | Sequence | __delitem__, __getitem__, __len__, __setitem__, insert | Sequence’s methods, plus __iadd__, append, extend, pop, remove, reverse
MutableSet | Set | __contains__, __iter__, __len__, add, discard | Set’s methods, plus __iand__, __ior__, __isub__, __ixor__, clear, pop, remove
ValuesView | MappingView | | __contains__, __iter__

See the online docs for further details and usage examples.
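As a brief sketch of how the mixin methods pay off (v3 only, since it imports from collections.abc), the hypothetical class Evens below overrides just the two abstract methods of Sequence and inherits the rest. For brevity, this sketch does not support slicing.

```python
from collections.abc import Sequence

class Evens(Sequence):
    """Hypothetical read-only sequence of the first n even numbers."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not isinstance(i, int):
            raise TypeError('slices are not supported in this sketch')
        if i < 0:
            i += self.n
        if not 0 <= i < self.n:
            raise IndexError(i)
        return 2 * i

e = Evens(5)
```

Iteration, membership tests, reversed(e), e.index, and e.count all come from Sequence’s mixin methods, implemented in terms of __getitem__ and __len__.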

The numbers module

numbers supplies a hierarchy (also known as a tower) of ABCs representing various kinds of numbers. numbers supplies the following ABCs:

Number The root of the hierarchy: numbers of any kind (need not support any given operation)
Complex Extends Number; must support (via the appropriate special methods) conversions to complex and bool, +, -, *, /, ==, !=, abs(); and, directly, the method conjugate() and properties real and imag
Real Extends Complex; additionally, must support (via the appropriate special methods) conversion to float, math.trunc(), round(), math.floor(), math.ceil(), divmod(), //, %, <, <=, >, >=
Rational Extends Real; additionally, must support the properties numerator and denominator
Integral Extends Rational; additionally, must support (via the appropriate special methods) conversion to int, **, and bitwise operations <<, >>, &, ^, |, ~

See the online docs for notes on implementing your own numeric types.
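To see where some familiar types sit in the tower, you can test instances against the ABCs from the narrowest to the broadest. The helper narrowest_abc below is hypothetical, written only for this illustration.

```python
import numbers
from fractions import Fraction

def narrowest_abc(n):
    """Hypothetical helper: name of the most specific numbers ABC for n."""
    tower = (numbers.Integral, numbers.Rational, numbers.Real,
             numbers.Complex, numbers.Number)
    for abc_class in tower:
        if isinstance(n, abc_class):
            return abc_class.__name__
    return None      # n is not a number, as far as the tower knows

names = [narrowest_abc(v) for v in (7, Fraction(1, 3), 2.5, 1j, 'x')]
```

names is ['Integral', 'Rational', 'Real', 'Complex', None].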

Special Methods for Numeric Objects

An instance may support numeric operations by means of many special methods. Some classes that are not numbers also support some of the special methods in Table 4-4 in order to overload operators such as + and *. In particular, sequences should have special methods __add__, __mul__, __radd__, and __rmul__, as mentioned in “Sequences”.

Table 4-4.  

__abs__, __invert__, __neg__, __pos__

__abs__(self) __invert__(self) __neg__(self) __pos__(self)

The unary operators abs(x), ~x, -x, and +x, respectively, call these methods.

__add__, __mod__, __mul__, __sub__

__add__(self,other) __mod__(self,other) __mul__(self,other) __sub__(self,other)

The operators x+y, x%y, x*y, and x-y, respectively, call these methods, usually for arithmetic computations.

__div__, __floordiv__, __truediv__

__div__(self,other) __floordiv__(self,other) __truediv__(self,other)

The operators x/y and x//y call these methods, usually for arithmetic divisions. In v2, operator / calls __truediv__, if present, instead of __div__, in situations where division is nontruncating, as covered in “Arithmetic Operations”. In v3, there is no __div__, only __truediv__ and __floordiv__.

__and__, __lshift__, __or__, __rshift__, __xor__

__and__(self,other) __lshift__(self,other) __or__(self,other) __rshift__(self,other) __xor__(self,other)

The operators x&y, x<<y, x|y, x>>y, and x^y, respectively, call these methods, usually for bitwise operations.

__complex__, __float__, __int__, __long__

__complex__(self) __float__(self) __int__(self) __long__(self)

The built-in types complex(x), float(x), int(x), and (in v2 only) long(x), respectively, call these methods.

__divmod__

__divmod__(self,other)

The built-in function divmod(x,y) calls x.__divmod__(y). __divmod__ should return a pair (quotient,remainder) equal to (x//y,x%y).

__hex__, __oct__

__hex__(self) __oct__(self)

In v2 only, the built-in function hex(x) calls x.__hex__(), and built-in function oct(x) calls x.__oct__(). Each of these special methods should return a string representing the value of x, in base 16 and 8, respectively. In v3, these special methods don’t exist: the built-in functions hex and oct operate directly on the result of calling the special method __index__ on their operand.

__iadd__, __idiv__, __ifloordiv__, __imod__, __imul__, __isub__, __itruediv__

__iadd__(self,other) __idiv__(self,other) __ifloordiv__(self,other) __imod__(self,other) __imul__(self,other) __isub__(self,other) __itruediv__(self,other)

The augmented assignments x+=y, x/=y, x//=y, x%=y, x*=y, x-=y, and x/=y, respectively, call these methods. Each method should modify x in place and return self. Define these methods when x is mutable (i.e., when x can change in place).

__iand__, __ilshift__, __ior__, __irshift__, __ixor__

__iand__(self,other) __ilshift__(self,other) __ior__(self,other) __irshift__(self,other) __ixor__(self,other)

The augmented assignments x&=y, x<<=y, x|=y, x>>=y, and x^=y, respectively, call these methods. Each method should modify x in place and return self.

__index__

__index__(self)

Like __int__, but meant to be supplied only by types that are alternative implementations of integers (in other words, all of the type’s instances can be exactly mapped into integers). For example, out of all built-in types, only int (and, in v2, long) supply __index__; float doesn’t, although it does supply __int__, and neither does str, although int(x) accepts suitable strings. Sequence indexing and slicing internally use __index__ to get the needed integer indices.

__ipow__

__ipow__(self,other)

The augmented assignment x**=y calls x.__ipow__(y). __ipow__ should modify x in place and return self.

__pow__

__pow__(self,other[,modulo])

x**y and pow(x,y) both call x.__pow__(y), while pow(x,y,z) calls x.__pow__(y,z). x.__pow__(y,z) should return a value equal to the expression x.__pow__(y)%z.

__radd__, __rdiv__, __rmod__, __rmul__, __rsub__

__radd__(self,other) __rdiv__(self,other) __rmod__(self,other) __rmul__(self,other) __rsub__(self,other)

The operators y+x, y/x, y%x, y*x, and y-x, respectively, call these methods on x when y doesn’t have a needed method __add__, __div__, and so on, or when that method returns NotImplemented.

__rand__, __rlshift__, __ror__, __rrshift__, __rxor__

__rand__(self,other) __rlshift__(self,other) __ror__(self,other) __rrshift__(self,other) __rxor__(self,other)

The operators y&x, y<<x, y|x, y>>x, and y^x, respectively, call these methods on x when y doesn’t have a needed method __and__, __lshift__, and so on, or when that method returns NotImplemented.

__rdivmod__

__rdivmod__(self,other)

The built-in function divmod(y,x) calls x.__rdivmod__(y) when y doesn’t have __divmod__, or when that method returns NotImplemented. __rdivmod__ should return a pair (quotient,remainder).

__rpow__

__rpow__(self,other)

y**x and pow(y,x) call x.__rpow__(y) when y doesn’t have __pow__, or when that method returns NotImplemented. There is no three-argument form in this case.
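The interplay of normal, reflected, and in-place methods in Table 4-4 can be sketched with a small vector class. The class Vec2 is hypothetical; it supports only vector addition and multiplication by a real scalar.

```python
import numbers

class Vec2(object):
    """Hypothetical 2D vector supporting +, +=, and scalar *."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented     # let Python try other.__radd__
        return Vec2(self.x + other.x, self.y + other.y)

    def __iadd__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented
        self.x += other.x             # modify in place...
        self.y += other.y
        return self                   # ...and return self

    def __mul__(self, k):
        if not isinstance(k, numbers.Real):
            return NotImplemented
        return Vec2(self.x * k, self.y * k)

    __rmul__ = __mul__                # scalar * vector is commutative here

v = Vec2(1, 1)
w = v
v += Vec2(1, 1)                       # w still refers to the same object
scaled = 3 * Vec2(1, 2)               # int's __mul__ fails; __rmul__ runs
```

An unsupported operation such as Vec2(1, 2) + 1 raises TypeError, since both __add__ and the int’s reflected method return NotImplemented.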

Decorators

In Python, you often use so-called higher-order functions, callables that accept a function as an argument and return a function as their result. For example, descriptor types such as staticmethod and classmethod, covered in “Class-Level Methods”, can be used, within class bodies, as:

def f(cls, ...):
  ...definition of f snipped...
f = classmethod(f)

However, having the call to classmethod textually after the def statement decreases code readability: while reading f’s definition, the reader of the code is not yet aware that f is going to become a class method rather than an instance method. The code is more readable if the mention of classmethod comes before, not after, the def. For this purpose, use the syntax form known as decoration:

@classmethod
def f(cls, ...):
  ...definition of f snipped...

The decorator must be immediately followed by a def statement and means that f=classmethod(f) executes right after the def statement (for whatever name f the def defines). More generally, @expression evaluates the expression (which must be a name, possibly qualified, or a call) and binds the result to an internal temporary name (say, __aux); any decorator must be immediately followed by a def (or class) statement, and means that f=__aux(f) executes right after the def or class (for whatever name f the def or class defines). The object bound to __aux is known as a decorator, and it’s said to decorate function or class f.

Decorators afford a handy shorthand for some higher-order functions. You may apply decorators to any def or class statement, not just ones occurring in class bodies. You may also code custom decorators, which are just higher-order functions, accepting a function or class object as an argument and returning a function or class object as the result. For example, here is a simple decorator that does not modify the function it decorates, but rather prints the function’s docstring to standard output at function-definition time:

def showdoc(f):
    if f.__doc__:
        print('{}: {}'.format(f.__name__, f.__doc__))
    else:
        print('{}: No docstring!'.format(f.__name__))
    return f

@showdoc
def f1(): """a docstring"""
# prints: f1: a docstring

@showdoc
def f2(): pass
# prints: f2: No docstring!

The standard library module functools offers a handy decorator, wraps, to enhance decorators built by the common “wrapping” idiom:

import functools
def announce(f):
    @functools.wraps(f)
    def wrap(*a, **k):
        print('Calling {}'.format(f.__name__))
        return f(*a, **k)
    return wrap

Decorating a function f with @announce causes a line announcing the call to be printed before each call to f. Thanks to the functools.wraps(f) decorator, the wrapper function adopts the name and docstring of the wrapped one, which is useful, for example, when calling the built-in help on such a decorated function.
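The effect of functools.wraps can be checked directly. The following sketch uses a different, hypothetical wrapping decorator, count_calls, which keeps a call counter as an attribute of the wrapper function.

```python
import functools

def count_calls(f):
    """Hypothetical decorator: counts calls in the wrapper's calls attribute."""
    @functools.wraps(f)
    def wrapper(*a, **k):
        wrapper.calls += 1
        return f(*a, **k)
    wrapper.calls = 0
    return wrapper

@count_calls
def greet(name):
    """Return a greeting."""
    return 'Hello, ' + name

first = greet('Ann')
second = greet('Bob')
```

After these two calls, greet.calls is 2, while greet.__name__ is still 'greet' and greet.__doc__ is still 'Return a greeting.'; without functools.wraps, the name would be 'wrapper' and the docstring None.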

Metaclasses

Any object, even a class object, has a type. In Python, types and classes are also first-class objects. The type of a class object is also known as the class’s metaclass.4 An object’s behavior is mostly determined by the type of the object. This also holds for classes: a class’s behavior is mostly determined by the class’s metaclass. Metaclasses are an advanced subject, and you may want to skip the rest of this section. However, fully grasping metaclasses can lead you to a deeper understanding of Python, and, very occasionally, it can be useful to define your own custom metaclasses.5

How Python v2 Determines a Class’s Metaclass

To execute a class statement, v2 first collects the base classes into a tuple t (an empty one if there are no base classes) and executes the class body, storing the names there defined in a temporary dictionary d. Then, Python determines the metaclass M to use for the new class object C that the class statement is creating.

When '__metaclass__' is a key in d, M is d['__metaclass__']. Thus, you can explicitly control class C’s metaclass by binding the attribute __metaclass__ in C’s class body (for clarity, do that as the first statement in the class body, right after the docstring). Otherwise, when t is nonempty (i.e., when C has one or more base classes), M is the leafmost metaclass among all of the metaclasses of C’s bases (i.e., the metaclass of a base of C that issubclass of all other metaclasses of bases of C; it is an error, and Python raises an exception, if no metaclass of a base of C issubclass of all others).6

This is why inheriting from object, in v2, indicates that C is a new-style class (rather than a legacy-style class, a concept that exists only for backward compatibility and that this book does not cover). Since type(object) is type, a class C that inherits from object (or some other built-in type) gets the same metaclass as object (i.e., type(C), C’s metaclass, is also type). Thus, in v2, “being a new-style class” is synonymous with “having type as the metaclass.”

When C has no base classes, but the current module has a global variable __metaclass__, in v2, M is the value of that global variable. This lets you make classes without bases default to being new-style classes, rather than legacy classes, throughout a module. Just place the following statement toward the start of the module body, before any class statement:

__metaclass__ = type

When none of these conditions applies (that is, when the metaclass is not explicitly specified, not inherited, and not defined as the module’s global variable), M, in v2, defaults to types.ClassType (making an old-style, legacy class).

How Python v3 Determines a Class’s Metaclass

In v3, the class statement accepts optional named arguments (after the bases, if any). The most important named argument is metaclass, which, if present, identifies the new class’s metaclass. Other named arguments are allowed only if a non-type metaclass is present, and in this case they are passed on to the __prepare__ method of the metaclass (it’s entirely up to said method __prepare__ to make use of such named arguments). When the named argument metaclass is absent, v3 determines the metaclass by inheritance, exactly like v2, or else, for classes with no explicitly specified bases, defaults to type.
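
The rules for v3 can be sketched as follows. All class and metaclass names below are hypothetical, chosen only to show the explicit, inherited, leafmost, default, and conflicting cases:

```python
class MetaA(type):
    pass

class MetaB(MetaA):          # MetaB is a subclass of MetaA
    pass

class MetaOther(type):       # unrelated to MetaA and MetaB
    pass

class A(metaclass=MetaA): pass          # explicit metaclass
class B(metaclass=MetaB): pass
class Other(metaclass=MetaOther): pass

class Plain: pass                       # no bases, no metaclass=
print(type(Plain).__name__)      # prints: type

class SubA(A): pass                     # metaclass inherited from A
print(type(SubA).__name__)       # prints: MetaA

class AB(A, B): pass                    # leafmost of MetaA and MetaB
print(type(AB).__name__)         # prints: MetaB

try:
    class Clash(A, Other):       # MetaA and MetaOther are unrelated
        pass
except TypeError:
    conflict = True
else:
    conflict = False
print(conflict)                  # prints: True
```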

In v3, a metaclass has an optional method __prepare__, which Python calls as soon as it determines the metaclass, as follows:

class MC:
    def __prepare__(classname, *classbases, **kwargs):
        return {}
  ...rest of MC snipped...

class X(onebase, another, metaclass=MC, foo='bar'):
  ...body of X snipped...

Here, the call is equivalent to MC.__prepare__('X', (onebase, another), foo='bar'): Python passes the bases as a single tuple, so *classbases in the signature above collects that tuple as its only item. __prepare__, if present, must return a mapping (most often just a dictionary), which Python uses as the d in which it executes the class body. If a metaclass wants to retain the order in which names get bound in the class body, its __prepare__ method can return an instance of collections.OrderedDict (covered in “OrderedDict”); this allows a v3 metaclass to use a class to naturally represent constructs in which the ordering of attributes matters, such as database schemas (a feature that is just not possible in v2).7 If __prepare__ is absent, v3 uses a dictionary as d (as v2 does in all cases).
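
The following v3-only sketch records how Python calls __prepare__ and uses the returned mapping to recover the order of the names bound in the class body. RecordingMeta, last_call, field_order, Base, and X are all hypothetical names. Two choices here are assumptions of this sketch rather than anything the text above prescribes: __prepare__ is coded as a classmethod, and __new__ and __init__ accept and discard extra named arguments, because named arguments such as foo='bar' also reach the call to the metaclass itself:

```python
import collections

class RecordingMeta(type):
    @classmethod
    def __prepare__(mcl, classname, bases, **kwargs):
        # remember how Python called us (for illustration only)
        mcl.last_call = (classname, bases, kwargs)
        return collections.OrderedDict()

    def __new__(mcl, classname, bases, classdict, **kwargs):
        cls = super().__new__(mcl, classname, bases, dict(classdict))
        # names bound in the class body, in order, minus dunder names
        cls.field_order = [k for k in classdict
                           if not (k.startswith('__') and k.endswith('__'))]
        return cls

    def __init__(cls, classname, bases, classdict, **kwargs):
        super().__init__(classname, bases, classdict)

class Base(object):
    pass

class X(Base, metaclass=RecordingMeta, foo='bar'):
    zeta = 1
    alpha = 2
    mid = 3

classname, bases, kwargs = RecordingMeta.last_call
print(classname, [b.__name__ for b in bases], kwargs)
# prints: X ['Base'] {'foo': 'bar'}
print(X.field_order)            # prints: ['zeta', 'alpha', 'mid']
```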

How a Metaclass Creates a Class

Having determined M, Python calls M with three arguments: the class name (a string), the tuple of base classes t, and the dictionary (or, in v3, other mapping resulting from __prepare__) d in which the class body just finished executing. The call returns the class object C, which Python then binds to the class name, completing the execution of the class statement. Note that this is in fact an instantiation of type M, so the call to M executes M.__init__(C, namestring, t, d), where C is the return value of M.__new__(M, namestring, t, d), just as in any other instantiation of a class.
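
Because creating a class is just an instantiation of the metaclass, you can observe the sequence by calling a metaclass directly with the same three arguments. In this sketch, TracingMeta and its calls list are hypothetical names used only to log the order of events:

```python
class TracingMeta(type):
    calls = []   # records the order in which the methods run

    def __new__(mcl, classname, bases, classdict):
        TracingMeta.calls.append('__new__ ' + classname)
        return super(TracingMeta, mcl).__new__(
            mcl, classname, bases, classdict)

    def __init__(cls, classname, bases, classdict):
        TracingMeta.calls.append('__init__ ' + classname)
        super(TracingMeta, cls).__init__(classname, bases, classdict)

# call the metaclass directly, with the same three arguments
# that a class statement would pass
C = TracingMeta('C', (object,), {'answer': 42})

print(TracingMeta.calls)    # prints: ['__new__ C', '__init__ C']
print(type(C) is TracingMeta, C.__name__, C.answer)   # prints: True C 42
```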

After Python creates class object C, the relationship between class C and its type (type(C), normally M) is the same as that between any object and its type. For example, when you call the class object C (to create an instance of C), M.__call__ executes with class object C as the first argument.

Note the benefit, in this context, of the approach described in “Per-Instance Methods”, whereby special methods are looked up only on the class, not on the instance—the core difference in the “new style” object model versus v2’s old “legacy style.” Calling C to instantiate it must execute the metaclass’s M.__call__, whether or not C has a per-instance attribute (method) __call__ (i.e., independently of whether instances of C are or aren’t callable). This is simply impossible in an object model like the legacy one, where per-instance methods override per-class ones for implicitly called special methods. The Python object model (specifically the “new-style” one in v2, and the only one existing in v3) avoids having to make the relationship between a class and its metaclass an ad hoc special case. Avoiding ad hoc special cases is a key to Python’s power: Python has few, simple, general rules, and applies them consistently.
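
A small sketch may make the point concrete. Here the hypothetical metaclass CountingMeta overrides __call__ to count instantiations, while the hypothetical class Widget defines its own __call__ to make its instances callable; the two do not interfere (v3 syntax is used for the metaclass binding):

```python
class CountingMeta(type):
    def __call__(cls, *args, **kwargs):
        # runs whenever a class whose metaclass is CountingMeta is called
        cls.instances_made = getattr(cls, 'instances_made', 0) + 1
        return super(CountingMeta, cls).__call__(*args, **kwargs)

class Widget(metaclass=CountingMeta):
    def __call__(self):
        # makes *instances* callable; plays no role in calling Widget itself
        return 'called an instance'

w1 = Widget()
w2 = Widget()
print(Widget.instances_made)   # prints: 2
print(w1())                    # prints: called an instance
print(Widget.instances_made)   # prints: 2 (calling an instance is not counted)
```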

Defining and using your own metaclasses

It’s easy to define custom metaclasses: inherit from type and override some of its methods. You can also achieve most of what a custom metaclass can do with __new__, __init__, __getattribute__, and so on, without involving metaclasses. However, a custom metaclass can be faster, since special processing is done only at class creation time, which is a rare operation. A custom metaclass lets you define a whole category of classes in a framework that magically acquire whatever interesting behavior you’ve coded, quite independently of what special methods the classes themselves may choose to define.

A good alternative, to alter a specific class in an explicit way, is often to use a class decorator, as mentioned in “Decorators”. However, decorators are not inherited, so the decorator must be explicitly applied to each class of interest. Metaclasses, on the other hand, are inherited; in fact, when you define a custom metaclass M, it’s usual to also define an otherwise-empty class C with metaclass M, so that other classes requiring metaclass M can just inherit from C.
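
The difference in inheritance can be sketched with a hypothetical class registry, implemented once as a class decorator (register) and once as a metaclass (RegisteringMeta). All names are illustrative assumptions, and the metaclass binding uses v3 syntax:

```python
registry_by_decorator = []
registry_by_metaclass = []

def register(cls):
    """Hypothetical class decorator: record the decorated class."""
    registry_by_decorator.append(cls.__name__)
    return cls

class RegisteringMeta(type):
    """Hypothetical metaclass: record every class it creates."""
    def __init__(cls, classname, bases, classdict):
        super(RegisteringMeta, cls).__init__(classname, bases, classdict)
        registry_by_metaclass.append(classname)

@register
class DecoratedBase(object):
    pass

class DecoratedChild(DecoratedBase):   # decorator NOT applied here
    pass

# otherwise-empty class whose only job is to carry the metaclass
class MetaBase(metaclass=RegisteringMeta):
    pass

class MetaChild(MetaBase):             # metaclass inherited from MetaBase
    pass

print(registry_by_decorator)   # prints: ['DecoratedBase']
print(registry_by_metaclass)   # prints: ['MetaBase', 'MetaChild']
```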

Some behavior of class objects can be customized only in metaclasses. The following example shows how to use a metaclass to change the string format of class objects:

class MyMeta(type):
    def __str__(cls):
        return 'Beautiful class {!r}'.format(cls.__name__)
class MyClass(metaclass=MyMeta):
    # in v2, remove the `metaclass=`, and use, as the class body:
    # __metaclass__ = MyMeta
    pass
x = MyClass()
print(type(x))      # prints: Beautiful class 'MyClass'

A substantial custom metaclass example

Suppose that, programming in Python, we miss C’s struct type: an object that is just a bunch of data attributes, in order, with fixed names (collections.namedtuple, covered in “namedtuple”, comes close, but named tuples are immutable, and we may not want that in some cases). Python lets us easily define a generic Bunch class that offers everything except the fixed order and names:

class SimpleBunch(object):
    def __init__(self, **fields):
        self.__dict__ = fields
p = SimpleBunch(x=2.3, y=4.5)
print(p)       # prints: <__main__.SimpleBunch object at 0x00AE8B10>
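
What SimpleBunch lacks shows up as soon as a name is misspelled. In this short sketch the attribute names z and colour are arbitrary examples: instances are mutable, as we want, but nothing constrains which attributes they hold:

```python
class SimpleBunch(object):
    def __init__(self, **fields):
        self.__dict__ = fields

p = SimpleBunch(x=2.3, y=4.5)
p.x = 7.0                 # attributes are mutable, unlike a named tuple's
p.z = 'oops'              # but any new name is silently accepted
q = SimpleBunch(colour='gray')   # as is any name at creation time
print(sorted(vars(p)))    # prints: ['x', 'y', 'z']
print(sorted(vars(q)))    # prints: ['colour']
```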

A custom metaclass lets us exploit the fact that the attribute names are fixed at class creation time. The code shown in Example 4-1 defines a metaclass, MetaBunch, and a class, Bunch, that let us write code like the following:

class Point(Bunch):
    """ A point has x and y coordinates, defaulting to 0.0, 
        and a color, defaulting to 'gray'—and nothing more, 
        except what Python and the metaclass conspire to add, 
        such as __init__ and __repr__
    """
    x = 0.0
    y = 0.0
    color = 'gray'
# example uses of class Point
q = Point()
print(q)                    # prints: Point()
p = Point(x=1.2, y=3.4)
print(p)                    # prints: Point(x=1.2, y=3.4)

In this code, the print calls emit readable string representations of our Point instances. Point instances are quite memory-lean, and their performance is basically the same as for instances of the simple class SimpleBunch in the previous example (there is no extra overhead due to implicit calls to special methods). Example 4-1 is quite substantial, and following all its details requires understanding aspects of Python covered later in this book, such as strings (Chapter 8) and module warnings (“The warnings Module”). The identifier mcl used in Example 4-1 stands for “metaclass,” clearer in this special advanced case than the habitual case of cls standing for “class.”

The example’s code works in both v2 and v3, except that, in v2, a tiny syntax adjustment must be made at the end, in Bunch, as shown in that class’s docstring.

Example 4-1. The MetaBunch metaclass
import collections
import warnings

class MetaBunch(type):
    """
    Metaclass for new and improved "Bunch": implicitly defines
    __slots__, __init__ and __repr__ from variables bound in
    class scope.
    A class statement for an instance of MetaBunch (i.e., for a
    class whose metaclass is MetaBunch) must define only
    class-scope data attributes (and possibly special methods, but
    NOT __init__ and __repr__).  MetaBunch removes the data
    attributes from class scope, snuggles them instead as items in
    a class-scope dict named __dflts__, and puts in the class a
    __slots__ with those attributes' names, an __init__ that takes
    as optional named arguments each of them (using the values in
    __dflts__ as defaults for missing ones), and a __repr__ that
    shows the repr of each attribute that differs from its default
    value (the output of __repr__ can be passed to eval to make
    an equal instance, as per usual convention in the matter, if
    each non-default-valued attribute respects the convention too).

    In v3, the order of data attributes remains the same as in the
    class body; in v2, there is no such guarantee.
    """
    def __prepare__(name, *bases, **kwargs):
        # precious in v3—harmless although useless in v2
        return collections.OrderedDict()

    def __new__(mcl, classname, bases, classdict):
        """ Everything needs to be done in __new__, since
            type.__new__ is where __slots__ are taken into account.
        """
        # define as local functions the __init__ and __repr__ that
        # we'll use in the new class
        def __init__(self, **kw):
            """ Simplistic __init__: first set all attributes to
                default values, then override those explicitly
                passed in kw.
            """
            for k in self.__dflts__:
                setattr(self, k, self.__dflts__[k])
            for k in kw:
                setattr(self, k, kw[k])
        def __repr__(self):
            """ Clever __repr__: show only attributes that differ
                from default values, for compactness.
            """
            rep = ['{}={!r}'.format(k, getattr(self, k))
                    for k in self.__dflts__
                    if getattr(self, k) != self.__dflts__[k]
                  ]
            return '{}({})'.format(classname, ', '.join(rep))
        # build the newdict that we'll use as class-dict for the
        # new class
        newdict = {'__slots__': [],
                   '__dflts__': collections.OrderedDict(),
                   '__init__': __init__, '__repr__': __repr__}
        for k in classdict:
            if k.startswith('__') and k.endswith('__'):
                # dunder methods: copy to newdict, or warn
                # about conflicts
                if k in newdict:
                    warnings.warn(
                        "Can't set attr {!r} in bunch-class {!r}".
                        format(k, classname))
                else:
                    newdict[k] = classdict[k]
            else:
                # class variables, store name in __slots__, and
                # name and value as an item in __dflts__
                newdict['__slots__'].append(k)
                newdict['__dflts__'][k] = classdict[k]
        # finally delegate the rest of the work to type.__new__
        return super(MetaBunch, mcl).__new__(
                     mcl, classname, bases, newdict)

class Bunch(metaclass=MetaBunch):
    """ For convenience: inheriting from Bunch can be used to get
        the new metaclass (same as defining metaclass= yourself).

        In v2, remove the (metaclass=MetaBunch) above and add
        instead __metaclass__=MetaBunch as the class body.
    """
    pass

1 To complete, for once, the usually truncated famous quote: “except of course for the problem of too many indirections.”

2 Third-party extensions can also define types of containers that are not sequences, not mappings, and not sets.

3 Lower-bound included; upper-bound excluded—as always, the norm for Python.

4 Strictly speaking, the type of a class C could be said to be the metaclass only of instances of C rather than of C itself, but this subtle distinction is rarely, if ever, observed in practice.

5 New in 3.6: while metaclasses work in 3.6 just like in 3.5, 3.6 also offers simpler ways to customize class creation (covered in PEP 487) that are often a good alternative to using a custom metaclass.

6 In other, more precise (but fancy) words, if C’s bases’ metaclasses do not form an inheritance lattice including its lower bound—that is, if there is no leafmost metaclass, no metaclass that issubclass of all others—Python raises an exception diagnosing this metatype conflict.

7 New in 3.6: it is no longer necessary to use OrderedDict here—3.6 guarantees the mapping used here preserves key order.
