Python in a Nutshell, 2nd Edition by Alex Martelli

Functions

Most statements in a typical Python program are grouped and organized into functions (code in a function body may be faster than at a module’s top level, as covered in Avoiding exec and from...import *, so there are excellent practical reasons to put most of your code into functions). A function is a group of statements that execute upon request. Python provides many built-in functions and allows programmers to define their own functions. A request to execute a function is known as a function call. When you call a function, you can pass arguments that specify data upon which the function performs its computation. In Python, a function always returns a result value, either None or a value that represents the results of the computation. Functions defined within class statements are also known as methods. Issues specific to methods are covered in Bound and Unbound Methods; the general coverage of functions in this section, however, also applies to methods.

In Python, functions are objects (values) that are handled like other objects. Thus, you can pass a function as an argument in a call to another function. Similarly, a function can return another function as the result of a call. A function, just like any other object, can be bound to a variable, an item in a container, or an attribute of an object. Functions can also be keys into a dictionary. For example, if you need to quickly find a function’s inverse given the function, you could define a dictionary whose keys and values are functions and then make the dictionary bidirectional. Here’s a small example of this idea, using some functions from module math, covered in The math and cmath Modules:

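from math import sin, asin, cos, acos, tan, atan, log, exp
inverse = {sin:asin, cos:acos, tan:atan, log:exp}
# keys() returns a list here, so adding items while looping is safe
for f in inverse.keys(): inverse[inverse[f]] = f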

The fact that functions are ordinary objects in Python is often expressed by saying that functions are first-class objects.
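As a minimal sketch of what first-class status allows, the following code passes a function as an argument, binds functions as items in a container, and uses a function as a dictionary key. The names apply_twice, square, and dispatch are hypothetical, chosen only for illustration:

```python
from math import sqrt

def apply_twice(func, value):
    # func is an ordinary parameter that happens to be bound to a function
    return func(func(value))

def square(x):
    return x * x

# pass a function as an argument in a call to another function
result = apply_twice(square, 3)          # square(square(3)) is 81

# bind functions as values of items in a container
dispatch = {'square': square, 'root': sqrt}
root_of_16 = dispatch['root'](16)        # 4.0

# use a function as a key into a dictionary
inverse = {square: sqrt}
round_trip = inverse[square](square(5))  # sqrt(25) is 5.0
```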

The def Statement

The def statement is the most common way to define a function. def is a single-clause compound statement with the following syntax:

def function-name(parameters):
    statement(s)

function-name is an identifier. It is a variable that gets bound (or rebound) to the function object when def executes.

parameters is an optional list of identifiers, known as formal parameters or just parameters, that get bound to the values supplied as arguments when the function is called. In the simplest case, a function doesn’t have any formal parameters, which means the function doesn’t take any arguments when it is called. In this case, the function definition has empty parentheses after function-name.

When a function does take arguments, parameters contains one or more identifiers, separated by commas (,). In this case, each call to the function supplies values, known as arguments, corresponding to the parameters listed in the function definition. The parameters are local variables of the function (as we’ll discuss later in this section), and each call to the function binds these local variables to the corresponding values that the caller supplies as arguments.

The nonempty sequence of statements, known as the function body, does not execute when the def statement executes. Rather, the function body executes later, each time the function is called. The function body can contain zero or more occurrences of the return statement, as we’ll discuss shortly.

Here’s an example of a simple function that returns a value that is twice the value passed to it each time it’s called:

def double(x):
    return x*2

Parameters

Formal parameters that are just identifiers indicate mandatory parameters. Each call to the function must supply a corresponding value (argument) for each mandatory parameter.

In the comma-separated list of parameters, zero or more mandatory parameters may be followed by zero or more optional parameters, where each optional parameter has the syntax:

identifier=expression

The def statement evaluates each such expression and saves a reference to the expression’s value, known as the default value for the parameter, among the attributes of the function object. When a function call does not supply an argument corresponding to an optional parameter, the call binds the parameter’s identifier to its default value for that execution of the function. Note that each default value gets computed when the def statement evaluates, not when the resulting function gets called. In particular, this means that the same object, the default value, gets bound to the optional parameter whenever the caller does not supply a corresponding argument. This can be tricky when the default value is a mutable object and the function body alters the parameter. For example:

def f(x, y=[]):
    y.append(x)
    return y
print f(23)                # prints: [23]
print f(42)                # prints: [23, 42]

The second print statement prints [23, 42] because the first call to f altered the default value of y, originally an empty list [], by appending 23 to it. If you want y to be bound to a new empty list object each time f is called with a single argument, use the following style instead:

def f(x, y=None):
    if y is None: y = []
    y.append(x)
    return y
print f(23)                # prints: [23]
print f(42)                # prints: [42]

At the end of the parameters, you may optionally use either or both of the special forms *identifier1 and **identifier2. If both forms are present, the form with two asterisks must be last. *identifier1 specifies that any call to the function may supply any number of extra positional arguments, while **identifier2 specifies that any call to the function may supply any number of extra named arguments (positional and named arguments are covered in Calling Functions). Every call to the function binds identifier1 to a tuple whose items are the extra positional arguments (or the empty tuple, if there are none). Similarly, identifier2 gets bound to a dictionary whose items are the names and values of the extra named arguments (or the empty dictionary, if there are none). Here’s a function that accepts any number of positional arguments and returns their sum:

def sum_args(*numbers):
    return sum(numbers)
print sum_args(23, 42)           # prints: 65
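The double-asterisk form can be sketched in the same spirit. The function describe below is hypothetical; it accepts any number of extra positional and named arguments and reports what it received, relying only on the tuple and dictionary bindings described above:

```python
def describe(*args, **kwargs):
    # args is bound to a tuple of the extra positional arguments,
    # kwargs to a dictionary of the extra named arguments
    return len(args), sorted(kwargs.items())

summary = describe(1, 2, color='red', size=3)
# summary is (2, [('color', 'red'), ('size', 3)])

empty = describe()
# with no arguments, args is () and kwargs is {}, so empty is (0, [])
```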

The number of parameters of a function, together with the parameters’ names, the number of mandatory parameters, and the information on whether (at the end of the parameters) either or both of the single- and double-asterisk special forms are present, collectively form a specification known as the function’s signature. A function’s signature defines the ways in which you can call the function.

Attributes of Function Objects

The def statement sets some attributes of a function object. The attribute func_name, also accessible as __name__, refers to the identifier string given as the function name in the def statement. In Python 2.3, this is a read-only attribute (trying to rebind or unbind it raises a runtime exception); in Python 2.4, you may rebind the attribute to any string value, but trying to unbind it raises an exception. The attribute func_defaults, which you may freely rebind or unbind, refers to the tuple of default values for the optional parameters (or None, if the function has no optional parameters).
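A short sketch of inspecting and rebinding these attributes follows. The function greet is hypothetical. The sketch uses the spelling __defaults__, which is an assumption about your Python version: it is an alias for func_defaults that exists only in Python 2.6 and later, so on older releases you would use func_defaults instead:

```python
def greet(name, greeting='Hello'):
    return greeting + ', ' + name

function_name = greet.__name__     # the identifier from the def statement
defaults = greet.__defaults__      # tuple of default values: ('Hello',)

before = greet('Ann')              # uses the original default
greet.__defaults__ = ('Hi',)       # rebind the tuple of default values
after = greet('Ann')               # now uses the new default
```

Rebinding default values this way is rarely advisable in production code; the sketch is meant only to show that the defaults live on the function object itself.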

Docstrings

Another function attribute is the documentation string, also known as the docstring. You may use or rebind a function’s docstring attribute as either func_doc or __doc__. If the first statement in the function body is a string literal, the compiler binds that string as the function’s docstring attribute. A similar rule applies to classes (see Class documentation strings) and modules (see Module documentation strings). Docstrings most often span multiple physical lines, so you normally specify them in triple-quoted string literal form. For example:

def sum_args(*numbers):
    '''Accept arbitrary numerical arguments and return their sum.
    The arguments are zero or more numbers.  The result is their sum.'''
    return sum(numbers)

Documentation strings should be part of any Python code you write. They play a role similar to that of comments in any programming language, but their applicability is wider, since they remain available at runtime. Development environments and tools can use docstrings from function, class, and module objects to remind the programmer how to use those objects. The doctest module (covered in The doctest Module) makes it easy to check that sample code present in docstrings is accurate and correct.

To make your docstrings as useful as possible, you should respect a few simple conventions. The first line of a docstring should be a concise summary of the function’s purpose, starting with an uppercase letter and ending with a period. It should not mention the function’s name, unless the name happens to be a natural-language word that comes naturally as part of a good, concise summary of the function’s operation. If the docstring is multiline, the second line should be empty, and the following lines should form one or more paragraphs, separated by empty lines, describing the function’s parameters, preconditions, return value, and side effects (if any). Further explanations, bibliographical references, and usage examples (which you should check with doctest) can optionally follow toward the end of the docstring.
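Here is one possible docstring laid out according to these conventions, for a hypothetical function mean: a one-line summary, an empty second line, a paragraph on parameters and result, and a usage example of the kind doctest can check:

```python
def mean(values):
    '''Return the arithmetic mean of a nonempty sequence of numbers.

    values must be a nonempty sequence of numbers.  The result is
    always a float.  The function has no side effects.

    >>> mean([1, 2, 3, 4])
    2.5
    '''
    return sum(values) / float(len(values))

# the docstring remains available at runtime
summary_line = mean.__doc__.splitlines()[0]
```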

Other attributes of function objects

In addition to its predefined attributes, a function object may have other arbitrary attributes. To create an attribute of a function object, bind a value to the appropriate attribute reference in an assignment statement after the def statement executes. For example, a function could count how many times it gets called:

def counter( ):
    counter.count += 1
    return counter.count
counter.count = 0

Note that this is not common usage. More often, when you want to group together some state (data) and some behavior (code), you should use the object-oriented mechanisms covered in Chapter 5. However, the ability to associate arbitrary attributes with a function can sometimes come in handy.

The return Statement

The return statement in Python is allowed only inside a function body and can optionally be followed by an expression. When return executes, the function terminates, and the value of the expression is the function’s result. A function returns None if it terminates by reaching the end of its body or by executing a return statement that has no expression (or, of course, by executing return None).

As a matter of style, you should never write a return statement without an expression at the end of a function body. If some return statements in a function have an expression, all return statements should have an expression. return None should only be written explicitly to meet this style requirement. Python does not enforce these stylistic conventions, but your code will be clearer and more readable if you follow them.
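The two hypothetical functions below sketch these conventions. In find_index, every return statement has an expression, including an explicit return None; in log_message, no return statement has an expression, so the body simply ends:

```python
def find_index(items, target):
    # some return statements have an expression, so all of them do,
    # including the explicit return None for the "not found" case
    for index, item in enumerate(items):
        if item == target:
            return index
    return None

def log_message(messages, text):
    # the function returns no useful value, so the body just ends,
    # with no bare return statement at the end
    messages.append(text)

found = find_index(['a', 'b', 'c'], 'b')     # 1
missing = find_index(['a', 'b', 'c'], 'z')   # None
```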

Calling Functions

A function call is an expression with the following syntax:

function-object(arguments)

function-object may be any reference to a function (or other callable) object; most often, it’s the function’s name. The parentheses denote the function-call operation itself. arguments, in the simplest case, is a series of zero or more expressions separated by commas (,), giving values for the function’s corresponding parameters. When the function call executes, the parameters are bound to the argument values, the function body executes, and the value of the function-call expression is whatever the function returns.

Note that just mentioning a function (or other callable object) does not call it. To call a function (or other object) without arguments, you must use () after the function’s name.
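The difference between mentioning and calling can be sketched as follows (answer is a hypothetical function used only for illustration):

```python
def answer():
    return 42

reference = answer     # no parentheses: binds another name to the function object
value = answer()       # parentheses: calls the function and binds its result

same_object = reference is answer   # True: both names refer to one function
```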

The semantics of argument passing

In traditional terms, all argument passing in Python is by value. For example, if you pass a variable as an argument, Python passes to the function the object (value) to which the variable currently refers, not “the variable itself.” Thus, a function cannot rebind the caller’s variables. However, if you pass a mutable object as an argument, the function may request changes to that object because Python passes the object itself, not a copy. Rebinding a variable and mutating an object are totally disjoint concepts. For example:

def f(x, y):
    x = 23
    y.append(42)
a = 77
b = [99]
f(a, b)
print a, b                # prints: 77 [99, 42]

The print statement shows that a is still bound to 77. Function f’s rebinding of its parameter x to 23 has no effect on f’s caller, nor, in particular, on the binding of the caller’s variable that happened to be used to pass 77 as the parameter’s value. However, the print statement also shows that b is now bound to [99, 42]. b is still bound to the same list object as before the call, but that object has mutated, as f has appended 42 to that list object. In either case, f has not altered the caller’s bindings, nor can f alter the number 77, since numbers are immutable. However, f can alter a list object, since list objects are mutable. In this example, f mutates the list object that the caller passes to f as the second argument by calling the object’s append method.

Kinds of arguments

Arguments that are just expressions are known as positional arguments. Each positional argument supplies the value for the parameter that corresponds to it by position (order) in the function definition.

In a function call, zero or more positional arguments may be followed by zero or more named arguments, each with the following syntax:

identifier=expression

The identifier must be one of the parameter names used in the def statement for the function. The expression supplies the value for the parameter of that name. Most built-in functions do not accept named arguments; you must call such functions with positional arguments only. However, all normal functions coded in Python accept named as well as positional arguments, so you may call them in different ways.

A function call must supply, via a positional or a named argument, exactly one value for each mandatory parameter, and zero or one value for each optional parameter. For example:

def divide(divisor, dividend):
    return dividend // divisor
print divide(12, 94)                         # prints: 7
print divide(dividend=94, divisor=12)        # prints: 7

As you can see, the two calls to divide are equivalent. You can pass named arguments for readability purposes whenever you think that identifying the role of each argument and controlling the order of arguments enhances your code’s clarity.

A common use of named arguments is to bind some optional parameters to specific values, while letting other optional parameters take default values:

def f(middle, begin='init', end='finis'):
    return begin+middle+end
print f('tini', end='')                     # prints: inittini

Thanks to named argument end='', the caller can specify a value, the empty string '', for f’s third parameter, end, and still let f’s second parameter, begin, use its default value, the string 'init'.

At the end of the arguments in a function call, you may optionally use either or both of the special forms *seq and **dct. If both forms are present, the form with two asterisks must be last. *seq passes the items of seq to the function as positional arguments (after the normal positional arguments, if any, that the call gives with the usual syntax). seq may be any iterable. **dct passes the items of dct to the function as named arguments, where dct must be a dictionary whose keys are all strings. Each item’s key is a parameter name, and the item’s value is the argument’s value.

Sometimes you want to pass an argument of the form *seq or **dct when the parameters use similar forms, as described earlier in Parameters. For example, using the function sum_args defined in that section (and shown again here), you may want to print the sum of all the values in dictionary d. This is easy with *seq:

def sum_args(*numbers):
    return sum(numbers)
print sum_args(*d.values( ))

(Of course, in this case, print sum(d.values( )) would be simpler and more direct!)

However, you may also pass arguments of the form *seq or **dct when calling a function that does not use the corresponding forms in its parameters. In that case, of course, you must ensure that iterable seq has the right number of items, or, respectively, that dictionary dct uses the right names as its keys; otherwise, the call operation raises an exception.
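As a sketch of that last case, the following code reuses the divide function from earlier in this section, which has no single- or double-asterisk parameters, and calls it with *seq and **dct. The names positional and named are hypothetical:

```python
def divide(divisor, dividend):
    return dividend // divisor

positional = (12, 94)
named = {'dividend': 94, 'divisor': 12}

from_seq = divide(*positional)    # same as divide(12, 94), which is 7
from_dct = divide(**named)        # same as divide(dividend=94, divisor=12)

try:
    divide(*(12, 94, 1))          # three items for a two-parameter function
except TypeError:
    error_raised = True           # the call operation raises TypeError
else:
    error_raised = False
```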

Namespaces

A function’s parameters, plus any variables that are bound (by assignment or by other binding statements, such as def) in the function body, make up the function’s local namespace, also known as local scope. Each of these variables is known as a local variable of the function.

Variables that are not local are known as global variables (in the absence of nested function definitions, which we’ll discuss shortly). Global variables are attributes of the module object, as covered in Attributes of module objects. Whenever a function’s local variable has the same name as a global variable, that name, within the function body, refers to the local variable, not the global one. We express this by saying that the local variable hides the global variable of the same name throughout the function body.
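A small sketch of hiding, using the hypothetical names limit, local_limit, and global_limit:

```python
limit = 10                 # a global variable (an attribute of the module)

def local_limit():
    limit = 3              # binds a local variable that hides the global one
    return limit

def global_limit():
    return limit           # no local binding, so this refers to the global

inside = local_limit()     # 3
outside = global_limit()   # 10; the global was never rebound
```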

The global statement

By default, any variable that is bound within a function body is a local variable of the function. If a function needs to rebind some global variables, the first statement of the function must be:

global identifiers

where identifiers is one or more identifiers separated by commas (,). The identifiers listed in a global statement refer to the global variables (i.e., attributes of the module object) that the function needs to rebind. For example, the function counter that we saw in Other attributes of function objects could be implemented using global and a global variable, rather than an attribute of the function object:

_count = 0
def counter( ):
    global _count
    _count += 1
    return _count

Without the global statement, the counter function would raise an UnboundLocalError exception because _count would then be an uninitialized (unbound) local variable. While the global statement enables this kind of programming, this style is often inelegant and unadvisable. As I mentioned earlier, when you want to group together some state and some behavior, the object-oriented mechanisms covered in Chapter 5 are usually best.

Don’t use global if the function body just uses a global variable (including mutating the object bound to that variable if the object is mutable). Use a global statement only if the function body rebinds a global variable (generally by assigning to the variable’s name). As a matter of style, don’t use global unless it’s strictly necessary, as its presence will cause readers of your program to assume the statement is there for some useful purpose. In particular, never use global except as the first statement in a function body.
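The distinction between mutating and rebinding can be sketched as follows. The names _history, record, and reset are hypothetical: record only mutates the list bound to a global variable and so needs no global statement, while reset rebinds the variable and so requires one:

```python
_history = []

def record(item):
    # mutates the object bound to the global _history; the name itself
    # is not rebound, so no global statement is needed
    _history.append(item)

def reset():
    # rebinds the global name, so the global statement is required
    global _history
    _history = []

record('a')
record('b')
before_reset = list(_history)     # ['a', 'b']
reset()
after_reset = list(_history)      # []
```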

Nested functions and nested scopes

A def statement within a function body defines a nested function, and the function whose body includes the def is known as an outer function to the nested one. Code in a nested function’s body may access (but not rebind) local variables of an outer function, also known as free variables of the nested function.

The simplest way to let a nested function access a value is often not to rely on nested scopes, but rather to explicitly pass that value as one of the function’s arguments. If necessary, the argument’s value can be bound when the nested function is defined by using the value as the default for an optional argument. For example:

def percent1(a, b, c):
    def pc(x, total=a+b+c): return (x*100.0) / total
    print "Percentages are:", pc(a), pc(b), pc(c)

Here’s the same functionality using nested scopes:

def percent2(a, b, c):
    def pc(x): return (x*100.0) / (a+b+c)
    print "Percentages are:", pc(a), pc(b), pc(c)

In this specific case, percent1 has a tiny advantage: the computation of a+b+c happens only once, while percent2’s inner function pc repeats the computation three times. However, if the outer function rebinds its local variables between calls to the nested function, repeating the computation can be necessary. It’s therefore advisable to be aware of both approaches, and choose the most appropriate one case by case.
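To illustrate when the two approaches differ, here is a sketch with two hypothetical functions in which the outer function rebinds a local variable after the nested def executes. A default value keeps the total computed when def executed, while the nested-scope version sees the rebound variable:

```python
def frozen_total(a, b):
    def pc(x, total=a+b):
        return (x * 100.0) / total     # total was fixed when def executed
    a = a * 2                          # rebinding a does not affect total
    return pc(b)

def live_total(a, b):
    def pc(x):
        return (x * 100.0) / (a + b)   # a and b are looked up at call time
    a = a * 2                          # pc sees the rebound a
    return pc(b)

frozen = frozen_total(1, 3)    # 300.0 / 4 is 75.0
live = live_total(1, 3)        # 300.0 / 5 is 60.0
```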

A nested function that accesses values from outer local variables is also known as a closure. The following example shows how to build a closure:

def make_adder(augend):
    def add(addend):
        return addend+augend
    return add

Closures are an exception to the general rule that the object-oriented mechanisms covered in Chapter 5 are the best way to bundle together data and code. When you need specifically to construct callable objects, with some parameters fixed at object construction time, closures can be simpler and more effective than classes. For example, the result of make_adder(7) is a function that accepts a single argument and adds 7 to that argument. An outer function that returns a closure is a “factory” for members of a family of functions distinguished by some parameters, such as the value of argument augend in the previous example, and may often help you avoid code duplication.
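A brief usage sketch of make_adder as a factory (the names add7 and add100 are hypothetical); each call to the outer function returns a separate closure with its own value of augend:

```python
def make_adder(augend):
    def add(addend):
        return addend + augend
    return add

add7 = make_adder(7)         # a function that adds 7 to its argument
add100 = make_adder(100)     # an independent function that adds 100

results = [add7(3), add100(3)]    # [10, 103]
```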

lambda Expressions

If a function body is a single return expression statement, you may choose to replace the function with the special lambda expression form:

lambda parameters: expression

A lambda expression is the anonymous equivalent of a normal function whose body is a single return statement. Note that the lambda syntax does not use the return keyword. You can use a lambda expression wherever you could use a reference to a function. lambda can sometimes be handy when you want to use a simple function as an argument or return value. Here’s an example that uses a lambda expression as an argument to the built-in filter function (covered in filter in Built-in Functions):

aList = [1, 2, 3, 4, 5, 6, 7, 8, 9]
low = 3
high = 7
filter(lambda x, l=low, h=high: h>x>l, aList)    # returns: [4, 5, 6]

As an alternative, you can always use a local def statement that gives the function object a name. You can then use this name as the argument or return value. Here’s the same filter example using a local def statement:

aList = [1, 2, 3, 4, 5, 6, 7, 8, 9]
low = 3
high = 7
def within_bounds(value, l=low, h=high):
    return h>value>l
filter(within_bounds, aList)                     # returns: [4, 5, 6]

While lambda can occasionally be useful, many Python users prefer def, which is more general, and may make your code more readable if you choose a reasonable name for the function.

Generators

When the body of a function contains one or more occurrences of the keyword yield, the function is known as a generator. When you call a generator, the function body does not execute. Instead, calling the generator returns a special iterator object that wraps the function body, its local variables (including its parameters), and the current point of execution, which is initially the start of the function.

When the next method of this iterator object is called, the function body executes up to the next yield statement, which takes the form:

yield expression

When a yield statement executes, the function execution is “frozen,” with current point of execution and local variables intact, and the expression following yield is returned as the result of the next method. When next is called again, execution of the function body resumes where it left off, again up to the next yield statement. If the function body ends, or executes a return statement, the iterator raises a StopIteration exception to indicate that the iteration is finished. return statements in a generator cannot contain expressions.
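The following sketch steps through a hypothetical generator, two_items, by hand. It assumes the built-in function next, which exists only in Python 2.6 and later; on earlier versions, call the iterator's next method (iterator.next()) instead:

```python
def two_items():
    yield 'first'
    yield 'second'

iterator = two_items()       # the function body has not started executing
first = next(iterator)       # runs the body up to the first yield
second = next(iterator)      # resumes and runs up to the second yield

try:
    next(iterator)           # the body ends, so StopIteration is raised
except StopIteration:
    finished = True
else:
    finished = False
```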

A generator is a very handy way to build an iterator. Since the most common way to use an iterator is to loop on it with a for statement, you typically call a generator like this:

for avariable in somegenerator(arguments):

For example, say that you want a sequence of numbers counting up from 1 to N and then down to 1 again. A generator can help:

def updown(N):
    for x in xrange(1, N): yield x
    for x in xrange(N, 0, -1): yield x
for i in updown(3): print i                   # prints: 1 2 3 2 1

Here is a generator that works somewhat like the built-in xrange function, but returns a sequence of floating-point values instead of a sequence of integers:

def frange(start, stop, step=1.0):
    while start < stop:
        yield start
        start += step

This frange example is only somewhat like xrange because, for simplicity, it makes arguments start and stop mandatory, and silently assumes step is positive.
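One way to lift the last restriction is sketched below. The name frange2 is hypothetical, and this is only one possible design: it picks the comparison according to the sign of step and rejects a step of zero. Because the body of a generator does not run until iteration starts, the ValueError surfaces on the first request for an item, not on the call itself:

```python
def frange2(start, stop, step=1.0):
    # hypothetical variant of frange that also accepts a negative step
    if step == 0:
        raise ValueError('step must not be zero')
    if step > 0:
        while start < stop:
            yield start
            start += step
    else:
        while start > stop:
            yield start
            start += step

ascending = list(frange2(0.0, 1.0, 0.25))     # [0.0, 0.25, 0.5, 0.75]
descending = list(frange2(1.0, 0.0, -0.5))    # [1.0, 0.5]
```

The steps in this usage example are exactly representable in binary floating point; with a step such as 0.1, accumulated rounding error may change the number of items produced.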

Generators are more flexible than functions that return lists. A generator may build an unbounded iterator, meaning one that returns an infinite stream of results (to use only in loops that terminate by other means, e.g., via a break statement). Further, a generator-built iterator performs lazy evaluation: the iterator computes each successive item only when and if needed, just in time, while the equivalent function does all computations in advance and may require large amounts of memory to hold the results list. Therefore, if all you need is the ability to iterate on a computed sequence, it is often best to compute the sequence in a generator rather than in a function that returns a list. If the caller needs a list of all the items produced by some bounded generator G(arguments), the caller can simply use the following code:

resulting_list = list(G(arguments))
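An unbounded generator, by contrast, must never be passed to list. A sketch with a hypothetical generator naturals, consumed by a loop that terminates via break:

```python
def naturals():
    # unbounded: this loop never ends on its own
    n = 0
    while True:
        yield n
        n += 1

squares = []
for n in naturals():
    if n * n > 50:
        break                # the caller decides when to stop
    squares.append(n * n)
# squares is [0, 1, 4, 9, 16, 25, 36, 49]
```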

Generator expressions

Python 2.4 introduces an even simpler way to code particularly simple generators: generator expressions, commonly known as genexps. The syntax of a genexp is just like that of a list comprehension (as covered in List comprehensions), except that a genexp is enclosed in parentheses (( )) instead of brackets ([]). The semantics of a genexp are the same as those of the corresponding list comprehension, except that a genexp produces an iterator yielding one item at a time, while a list comprehension produces a list of all results in memory (therefore, using a genexp, when appropriate, saves memory). For example, to sum the squares of all single-digit integers, in any modern Python, you can code sum([x*x for x in xrange(10)]). In Python 2.4, you can express this functionality even better by coding it as sum(x*x for x in xrange(10)) (just the same, but omitting the brackets), and obtain exactly the same result while consuming less memory. Note that the parentheses that indicate the function call also “do double duty” and enclose the genexp (no need for extra parentheses).
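The comparison can be sketched as follows. To keep the sketch runnable on versions that lack xrange, it uses range instead, and it uses the built-in next (Python 2.6 and later) to pull a single item; both choices are assumptions that depart from the text above:

```python
squares_list = [x * x for x in range(10)]    # builds the whole list in memory
squares_iter = (x * x for x in range(10))    # builds an iterator, no list

total_from_list = sum(squares_list)                   # 285
total_from_genexp = sum(x * x for x in range(10))     # 285, no extra parentheses

first_square = next(squares_iter)    # items are produced one at a time: 0
```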

Generators in Python 2.5

In Python 2.5, generators are further enhanced, with the possibility of receiving a value (or an exception) back from the caller as each yield executes. These advanced features allow generators in 2.5 to implement full-fledged co-routines, as explained at http://www.python.org/peps/pep-0342.html. The main change is that, in 2.5, yield is not a statement, but an expression, so it has a value. When a generator is resumed by calling its method next, the corresponding yield’s value is None. To pass a value x into some generator g (so that g receives x as the value of the yield on which it’s suspended), instead of calling g.next( ), the caller calls g.send(x) (calling g.send(None) is just like calling g.next( )). Also, a bare yield without arguments, in Python 2.5, becomes legal, and equivalent to yield None.
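A minimal sketch of passing values into a generator with send follows. The generator running_total is hypothetical, and the sketch uses the built-in next (Python 2.6 and later) where Python 2.5 code would call gen.next(). Note that the generator must first be advanced to its first yield before a value other than None can be sent:

```python
def running_total():
    total = 0
    while True:
        # yield the current total; the yield expression's value is
        # whatever the caller sends (None when resumed via next)
        received = yield total
        if received is not None:
            total += received

gen = running_total()
initial = next(gen)        # advance to the first yield, which yields 0
after_5 = gen.send(5)      # the suspended yield evaluates to 5; yields 5
after_7 = gen.send(7)      # yields 12
unchanged = next(gen)      # like gen.send(None): the total stays 12
```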

Other Python 2.5 enhancements to generators have to do with exceptions, and are covered in Generator enhancements.

Recursion

Python supports recursion (i.e., a Python function can call itself), but there is a limit to how deep the recursion can be. By default, Python interrupts recursion and raises a RuntimeError exception with the message “maximum recursion depth exceeded” (exceptions are covered in Standard Exception Classes) when it detects that the stack of recursive calls has gone over a depth of 1,000. You can change the recursion limit with function setrecursionlimit of module sys, covered in setrecursionlimit in The sys Module.

However, changing the recursion limit does not give you unlimited recursion; the absolute maximum limit depends on the platform on which your program is running, particularly on the underlying operating system and C runtime library, but it’s typically a few thousand levels. If recursive calls get too deep, your program crashes. Such runaway recursion, after a call to setrecursionlimit that exceeds the platform’s capabilities, is one of the very few ways a Python program can crash (really crash, hard, without the usual safety net of Python’s exception mechanisms). Therefore, be wary of trying to fix a program that is getting recursion-depth RuntimeError exceptions by raising the recursion limit too high with setrecursionlimit. Most often, you’d be better advised to look for ways to remove the recursion or, more specifically, limit the depth of recursion that your program needs.

Readers who are familiar with Lisp, Scheme, or functional-programming languages must in particular be aware that Python does not implement the optimization of “tail-call elimination,” which is so important in these languages. In Python, any call, recursive or not, has the same cost in terms of both time and memory space, dependent only on the number of arguments: the cost does not change, whether the call is a “tail-call” (meaning that the call is the last operation that the caller executes) or any other, nontail call.
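The following sketch shows a tail-recursive function running into the recursion limit, next to a loop that does the same work without recursion. The functions count_down and count_down_loop are hypothetical. The sketch catches RuntimeError, which is what exceeding the limit raises in Python 2; in Python 3.5 and later the exception is RecursionError, a subclass of RuntimeError, so the same except clause still applies:

```python
import sys

def count_down(n):
    # the recursive call is a tail call, but Python still uses one
    # level of the call stack for each call
    if n == 0:
        return 0
    return count_down(n - 1)

def count_down_loop(n):
    # the same computation as a loop: no recursion depth is involved
    while n > 0:
        n -= 1
    return n

depth = sys.getrecursionlimit() + 1000    # deliberately beyond the limit

try:
    count_down(depth)
except RuntimeError:       # "maximum recursion depth exceeded"
    too_deep = True
else:
    too_deep = False

loop_result = count_down_loop(depth)      # 0, however large depth is
```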
