Most statements in a typical Python program are organized into functions. A function is a group of statements that executes upon request. Python provides many built-in functions and allows programmers to define their own functions. A request to execute a function is known as a function call. When a function is called, it may be passed arguments that specify data upon which the function performs its computation. In Python, a function always returns a result value, either None or a value that represents the results of its computation. Functions defined within class statements are also called methods. Issues specific to methods are covered in Chapter 5; the general coverage of functions in this section, however, also applies to methods.
In Python, functions are objects (values) and are handled like other
objects. Thus, you can pass a function as an argument in a call to
another function. Similarly, a function can return another function
as the result of a call. A function, just like any other object, can
be bound to a variable, an item in a container, or an attribute of an
object. Functions can also be keys into a dictionary. For example, if
you need to quickly find a function’s inverse given
the function, you could define a dictionary whose keys and values are
functions and then make the dictionary bidirectional (using some
functions from module math, covered in Chapter 15):
from math import sin, asin, cos, acos, tan, atan, log, exp
inverse = {sin:asin, cos:acos, tan:atan, log:exp}
for f in inverse.keys( ):
    inverse[inverse[f]] = f
The fact that functions are objects in Python is often expressed by saying that functions are first-class objects.
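As a minimal sketch of what handling functions like other objects looks like in practice, the following passes a function as an argument, returns a function from a call, and stores functions in a dictionary. The names apply_twice, make_multiplier, and operations are hypothetical, introduced only for this illustration.

```python
def double(x):
    return x * 2

def apply_twice(func, value):
    # func is an ordinary argument that happens to be a function object
    return func(func(value))

def make_multiplier(factor):
    # build and return a new function object on each call
    def multiply(x, factor=factor):
        return x * factor
    return multiply

# functions stored as values in a dictionary
operations = {'double': double, 'triple': make_multiplier(3)}

result_twice = apply_twice(double, 5)        # 20
result_triple = operations['triple'](7)      # 21
```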
The def statement is the most common way to define a function. def is a single-clause compound statement with the following syntax:
def function-name(parameters):
    statement(s)
function-name is an identifier. It is a variable that gets bound (or rebound) to the function object when def executes.
parameters is an optional list of identifiers, called formal parameters or just parameters, that are used to represent values that are supplied as arguments when the function is called. In the simplest case, a function doesn’t have any formal parameters, which means the function doesn’t take any arguments when it is called. In this case, the function definition has empty parentheses following function-name.
When a function does take arguments, parameters contains one or more identifiers, separated by commas (,). In this case, each call to the function supplies values, known as arguments, that correspond to the parameters specified in the function definition. The parameters are local variables of the function, as we’ll discuss later in this section, and each call to the function binds these local variables to the corresponding values that the caller supplies as arguments.
The non-empty sequence of statements, known as the function body, does not execute when the def statement executes. Rather, the function body executes later, each time the function is called. The function body can contain zero or more occurrences of the return statement, as we’ll discuss shortly.
Here’s an example of a simple function that returns a value that is double the value passed to it:
def double(x): return x*2
Formal parameters that are simple identifiers indicate mandatory parameters. Each call to the function must supply a corresponding value (argument) for each mandatory parameter.
In the comma-separated list of parameters, zero or more mandatory parameters may be followed by zero or more optional parameters, where each optional parameter has the syntax:
identifier=expression
The def statement evaluates the expression and saves a reference to the value returned by the expression, called the default value for the parameter, among the attributes of the function object. When a function call does not supply an argument corresponding to an optional parameter, the call binds the parameter’s identifier to its default value for that execution of the function.
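The following sketch illustrates that the default expression is evaluated once, when def executes, and not on each call. The names base and scaled are hypothetical, chosen only for this example.

```python
base = 10

def scaled(x, factor=base * 2):
    # factor's default value (20) was computed once, when def executed
    return x * factor

base = 1000                      # rebinding base later does not change the default

default_result = scaled(3)       # 60: uses the saved default value 20
explicit_result = scaled(3, 5)   # 15: the caller's argument replaces the default
```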
Note that the same object, the default value, gets bound to the optional parameter whenever the caller does not supply a corresponding argument. This can be tricky when the default value is a mutable object and the function body alters the parameter. For example:
def f(x, y=[ ]):
    y.append(x)
    return y
print f(23)                # prints: [23]
print f(42)                # prints: [23,42]
The second print statement prints [23,42] because the first call to f altered the default value of y, originally an empty list [ ], by appending 23 to it. If you want y to be bound to a new empty list object each time f is called with a single argument, use the following:
def f(x, y=None):
    if y is None: y = [ ]
    y.append(x)
    return y
print f(23)                # prints: [23]
print f(42)                # prints: [42]
At the end of the formal parameters, you may optionally use either or both of the special forms *identifier1 and **identifier2. If both are present, the one with two asterisks must be last. *identifier1 indicates that any call to the function may supply extra positional arguments, while **identifier2 specifies that any call to the function may supply extra named arguments (positional and named arguments are covered later in this chapter). Every call to the function binds identifier1 to a tuple whose items are the extra positional arguments (or the empty tuple, if there are none). identifier2 is bound to a dictionary whose items are the names and values of the extra named arguments (or the empty dictionary, if there are none). Here’s how to write a function that accepts any number of arguments and returns their sum:
def sum(*numbers):
    result = 0
    for number in numbers: result += number
    return result
print sum(23,42)           # prints: 65
The ** form also lets you construct a dictionary with string keys in a more readable fashion than with the standard dictionary creation syntax:
def adict(**kwds): return kwds
print adict(a=23, b=42)    # prints: {'a':23, 'b':42}
Note that the body of function adict is just one simple statement, and therefore we can exercise the option to put it on the same line as the def statement. Of course, it would be just as correct (and arguably more readable) to code function adict using two lines instead of one:
def adict(**kwds):
    return kwds
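To show how mandatory parameters combine with the two special forms in one definition, here is a small sketch. The function describe and its parameter names are hypothetical, introduced only for illustration.

```python
def describe(first, *rest, **options):
    # rest is a tuple of the extra positional arguments,
    # options is a dictionary of the extra named arguments
    return first, rest, options

minimal = describe(1)                    # (1, (), {})
full = describe(1, 2, 3, sep='-')        # (1, (2, 3), {'sep': '-'})
```

With no extra arguments, rest is the empty tuple and options is the empty dictionary, as described above.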
The def statement defines some attributes of a function object. The attribute func_name, also accessible as __name__, is a read-only attribute (trying to rebind or unbind it raises a runtime exception) that refers to the identifier used as the function name in the def statement. The attribute func_defaults, which you may rebind or unbind, refers to the tuple of default values for the optional parameters (or the empty tuple, if the function has no optional parameters).
Another function attribute is the documentation string, also known as a docstring. You may use or rebind a function’s docstring attribute as either func_doc or __doc__. If the first statement in the function body is a string literal, the compiler binds that string as the function’s docstring attribute. A similar rule applies to classes (see Chapter 5) and modules (see Chapter 7). Docstrings most often span multiple physical lines, and are therefore normally specified in triple-quoted string literal form. For example:
def sum(*numbers):
    '''Accept arbitrary numerical arguments and return their sum.

    The arguments are zero or more numbers.  The result is their sum.'''
    result = 0
    for number in numbers: result += number
    return result
Documentation strings should be part of any Python code you write.
They play a role similar to that of comments in any programming
language, but their applicability is wider since they are available
at runtime. Development environments and other tools may use
docstrings from function, class, and module objects to remind the
programmer how to use those objects. The doctest module (covered in Chapter 17) makes it easy to
check that the sample code in docstrings is accurate and correct.
To make your docstrings as useful as possible, you should respect a
few simple conventions. The first line of a docstring should be a
concise summary of the function’s purpose, starting
with an uppercase letter and ending with a period. It should not
mention the function’s name, unless the name happens
to be a natural-language word that comes naturally as part of a good,
concise summary of the function’s operation. If the
docstring is multiline, the second line should be empty, and the
following lines should form one or more paragraphs, separated by
empty lines, describing the function’s expected
arguments, preconditions, return value, and side effects (if any).
Further explanations, bibliographical references, and usage examples
(to be checked with doctest) can optionally follow
toward the end of the docstring.
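The sketch below applies these conventions to a hypothetical function, average: a one-line summary, an empty second line, a paragraph on arguments and result, and a usage example at the end. It assumes a Python version whose doctest module provides run_docstring_examples; other ways of running doctest are covered in Chapter 17.

```python
import doctest

def average(*numbers):
    """Return the arithmetic mean of the arguments.

    The arguments are one or more numbers.  The result is a float.
    Calling with no arguments raises ZeroDivisionError.

    >>> average(1, 2, 3)
    2.0
    """
    total = 0
    for number in numbers:
        total += number
    return total / float(len(numbers))

# Check the usage example embedded in the docstring; prints nothing on success.
doctest.run_docstring_examples(average, {'average': average})
```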
In addition to its predefined attributes, a function object may be given arbitrary attributes. To create an attribute of a function object, bind a value to the appropriate attribute references in an assignment statement after the def statement has executed. For example, a function could count how many times it is called:
def counter( ):
    counter.count += 1
    return counter.count
counter.count = 0
Note that this is not common usage. More often, when you want to group together some state (data) and some behavior (code), you should use the object-oriented mechanisms covered in Chapter 5. However, the ability to associate arbitrary attributes with a function can sometimes come in handy.
The return statement in Python is allowed only inside a function body, and it can optionally be followed by an expression. When return executes, the function terminates and the value of the expression is returned. A function returns None if it terminates by reaching the end of its body or by executing a return statement that has no expression.
As a matter of style, you should not write a return statement without an expression at the end of a function body. If some return statements in a function have an expression, all return statements should have an expression. return None should only be written explicitly to meet this style requirement. Python does not enforce these stylistic conventions, but your code will be clearer and more readable if you follow them.
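A brief sketch of these conventions, using two hypothetical functions: find_index has one return with an expression, so its other return spells out None explicitly, while log_all never returns a value and therefore has no return statement at all.

```python
def find_index(items, target):
    index = 0
    for item in items:
        if item == target:
            return index         # returns a value here...
        index += 1
    return None                  # ...so the "not found" case is explicit too

def log_all(items, sink):
    for item in items:
        sink.append(item)
    # no return statement: the function returns None at the end of its body
```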
A function call is an expression with the following syntax:
function-object(arguments)
function-object may be any reference to a function object; it is most often the function’s name. The parentheses denote the function-call operation itself. arguments, in the simplest case, is a series of zero or more expressions separated by commas (,), giving values for the function’s corresponding formal parameters. When a function is called, the parameters are bound to these values, the function body executes, and the value of the function-call expression is whatever the function returns.
In traditional terms, all argument passing in Python is by value. For example, if a variable is passed as an argument, Python passes to the function the object (value) to which the variable currently refers, not the variable itself. Thus, a function cannot rebind the caller’s variables. However, if a mutable object is passed as an argument, the function may request changes to that object since Python passes the object itself, not a copy. Rebinding a variable and mutating an object are totally different concepts in Python. For example:
def f(x, y):
    x = 23
    y.append(42)
a = 77
b = [99]
f(a, b)
print a, b                 # prints: 77 [99, 42]
The print statement shows that a is still bound to 77. Function f’s rebinding of its parameter x to 23 has no effect on f’s caller, and in particular on the binding of the caller’s variable, which happened to be used to pass 77 as the parameter’s value. However, the print statement also shows that b is now bound to [99,42]. b is still bound to the same list object as before the call, but that object has mutated, as f has appended 42 to that list object. In either case, f has not altered the caller’s bindings, nor can f alter the number 77, as numbers are immutable. However, f can alter a list object, as list objects are mutable. In this example, f does mutate the list object that the caller passes to f as the second argument by calling the object’s append method.
Arguments that are just expressions are called positional arguments. Each positional argument supplies the value for the formal parameter that corresponds to it by position (order) in the function definition.
In a function call, zero or more positional arguments may be followed by zero or more named arguments with the following syntax:
identifier=expression
The identifier must be one of the formal parameter names used in the def statement for the function. The expression supplies the value for the formal parameter of that name.
A function call must supply, via either a positional or a named argument, exactly one value for each mandatory parameter, and zero or one value for each optional parameter. For example:
def divide(divisor, dividend): return dividend // divisor
print divide(12,94)                       # prints: 7
print divide(dividend=94, divisor=12)     # prints: 7
As you can see, the two calls to divide are equivalent. You can pass named arguments for readability purposes when you think that identifying the role of each argument and controlling the order of arguments enhances your code’s clarity.
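The rule of exactly one value per mandatory parameter can be seen by trying calls that break it. In this sketch, raises_type_error is a hypothetical helper, introduced only to record whether a given call to divide fails with TypeError.

```python
def divide(divisor, dividend):
    return dividend // divisor

def raises_type_error(*args, **kwds):
    # call divide with the given arguments and report whether it failed
    try:
        divide(*args, **kwds)
    except TypeError:
        return True
    return False

mixed = divide(12, dividend=94)                      # positional, then named: 7

missing = raises_type_error(12)                      # dividend gets no value
duplicate = raises_type_error(12, 94, divisor=3)     # divisor gets two values
unknown = raises_type_error(12, 94, extra=1)         # no parameter named extra
```

All three faulty calls raise TypeError before the function body starts executing.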
A more common use of named arguments is to bind some optional parameters to specific values, while letting other optional parameters take their default values:
def f(middle, begin='init', end='finis'): return begin+middle+end
print f('tini', end='')                   # prints: inittini
Thanks to named argument end='', the caller can specify a value, the empty string '', for f’s third parameter, end, and still let f’s second parameter, begin, use its default value, the string 'init'.
At the end of the arguments in a function call, you may optionally use either or both of the special forms *seq and **dict. If both are present, the one with two asterisks must be last. *seq passes the items of seq to the function as positional arguments (after the normal positional arguments, if any, that the call gives with the usual simple syntax). seq may be any sequence or iterable. **dict passes the items of dict to the function as named arguments, where dict must be a dictionary whose keys are all strings. Each item’s key is a parameter name, and the item’s value is the argument’s value.
Sometimes you want to pass an argument of the form *seq or **dict when the formal parameters use similar forms, as described earlier under Section 4.10.2. For example, using the function sum defined in that section (and shown again here), you may want to print the sum of all the values in dictionary d. This is easy with *seq:
def sum(*numbers):
    result = 0
    for number in numbers: result += number
    return result
print sum(*d.values( ))
However, you may also pass arguments of the form *seq or **dict when calling a function that does not use similar forms in its formal parameters.
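For instance, a function with two ordinary mandatory parameters can still be called with either special form. This sketch reuses the divide function from earlier; the variables positional and named are hypothetical names for this illustration.

```python
def divide(divisor, dividend):
    return dividend // divisor

positional = (12, 94)
named = {'dividend': 94, 'divisor': 12}

from_seq = divide(*positional)               # same as divide(12, 94): 7
from_dict = divide(**named)                  # same as divide(divisor=12, dividend=94): 7
combined = divide(12, **{'dividend': 94})    # normal argument plus **dict: 7
```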
A function’s formal parameters, plus any variables that are bound (by assignment or by other binding statements) in the function body, comprise the function’s local namespace, also known as local scope. Each of these variables is called a local variable of the function.
Variables that are not local are known as global variables (in the absence of nested definitions,
which we’ll discuss shortly). Global variables are
attributes of the module object, as covered in Chapter 7. If a local variable in a function has the
same name as a global variable, whenever that name is mentioned in
the function body, the local variable, not the global variable, is
used. This idea is expressed by saying that the local variable hides
the global variable of the same name throughout the function body.
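A small sketch of hiding, with hypothetical names limit, local_limit, and global_limit: the first function binds its own limit, so the global is hidden inside it, while the second only reads the name and so sees the global.

```python
limit = 10                  # global variable

def local_limit():
    limit = 3               # binds a local variable; the global is hidden here
    return limit

def global_limit():
    return limit            # no local binding, so this reads the global

# local_limit() returns 3, global_limit() returns 10,
# and the global limit is still 10 after both calls
```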
By default, any variable that is bound within a function body is a local variable of the function. If a function needs to rebind some global variables, the first statement of the function must be:
global identifiers
where identifiers is one or more identifiers separated by commas (,). The identifiers listed in a global statement refer to the global variables (i.e., attributes of the module object) that the function needs to rebind. For example, the function counter that we saw in Section 4.10.3 could be implemented using global and a global variable rather than an attribute of the function object as follows:
_count = 0
def counter( ):
    global _count
    _count += 1
    return _count
Without the global statement, the counter function would raise an UnboundLocalError exception because _count would be an uninitialized (unbound) local variable. Note also that while the global statement does enable this kind of programming, it is neither elegant nor advisable. As I mentioned earlier, when you want to group together some state and some behavior, the object-oriented mechanisms covered in Chapter 5 are typically the best approach.
You don’t need global if the function body simply uses a global variable, including changing the object bound to that variable if the object is mutable. You need to use a global statement only if the function body rebinds a global variable. As a matter of style, you should not use global unless it’s strictly necessary, as its presence will cause readers of your program to assume the statement is there for some useful purpose.
A def statement within a function body defines a nested function, and the function whose body includes the def is known as an outer function to the nested one. Code in a nested function’s body may access (but not rebind) local variables of an outer function, also known as free variables of the nested function. This nested-scope access is automatic in Python 2.2 and later. To request nested-scope access in Python 2.1, the first statement of the module must be:
from __future__ import nested_scopes
The simplest way to let a nested function access a value is often not to rely on nested scopes, but rather to explicitly pass that value as one of the function’s arguments. The argument’s value can be bound when the nested function is defined by using the value as the default for an optional argument. For example:
def percent1(a, b, c):                  # works with any version
    def pc(x, total=a+b+c): return (x*100.0) / total
    print "Percentages are ", pc(a), pc(b), pc(c)
Here’s the same functionality using nested scopes:
def percent2(a, b, c):                  # needs 2.2 or "from __future__ import"
    def pc(x): return (x*100.0) / (a+b+c)
    print "Percentages are", pc(a), pc(b), pc(c)
In this specific case, percent1 has a slight advantage: the computation of a+b+c happens only once, while percent2’s inner function pc repeats the computation three times. However, if the outer function were rebinding its local variables between calls to the nested function, repeating this computation might be an advantage. It’s therefore advisable to be aware of both approaches, and choose the most appropriate one case by case.
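The difference matters as soon as the outer function rebinds a variable after the nested functions are defined. In this sketch (which needs nested scopes), compare, by_default, and by_scope are hypothetical names: the default value is fixed at definition time, while the nested-scope lookup happens at call time.

```python
def compare():
    n = 1
    def by_default(n=n):    # captures the value n has right now (1)
        return n
    def by_scope():         # looks n up in compare's scope when called
        return n
    n = 2                   # rebind the outer local variable
    return by_default(), by_scope()

# compare() returns (1, 2)
```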
A nested function that accesses values from outer local variables is known as a closure. The following example shows how to build a closure without nested scopes (using a default value):
def make_adder_1(augend):               # works with any version
    def add(addend, _augend=augend): return addend+_augend
    return add
Here’s the same closure functionality using nested scopes:
def make_adder_2(augend):               # needs 2.2 or "from __future__ import"
    def add(addend): return addend+augend
    return add
Closures are an exception to the general rule that the object-oriented mechanisms covered in Chapter 5 are the best way to bundle together data and code. When you need to construct callable objects, with some parameters fixed at object construction time, closures can be simpler and more effective than classes. For example, the result of make_adder_1(7) is a function that accepts a single argument and adds 7 to that argument (the result of make_adder_2(7) behaves in just the same way). You can also express the same idea as lambda x: x+7, using the lambda form covered in the next section. A closure is a “factory” for any member of a family of functions distinguished by some parameters, such as the value of argument augend in the previous examples, and this may often help you avoid code duplication.
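As a rough comparison of the two ways to bundle data and code, the sketch below builds the same adder as a closure and as a class with a __call__ method. The class Adder is a hypothetical counterpart written for this illustration; the closure version needs nested scopes.

```python
def make_adder(augend):
    # closure: augend is remembered by the nested function
    def add(addend):
        return addend + augend
    return add

class Adder:
    # class-based equivalent: augend is stored as instance state
    def __init__(self, augend):
        self.augend = augend
    def __call__(self, addend):
        return addend + self.augend

add7 = make_adder(7)
add100 = make_adder(100)
add7_object = Adder(7)
# add7(35) and add7_object(35) are both 42; add100(35) is 135
```

Each call to make_adder produces a separate function with its own augend, which is what makes the outer function a factory.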
If a function body contains a single return expression statement, you may choose to replace the function with the special lambda expression form:
lambda parameters: expression
A lambda expression is the anonymous equivalent of a normal function whose body is a single return statement. Note that the lambda syntax does not use the return keyword. You can use a lambda expression wherever you would use a reference to a function. lambda can sometimes be handy when you want to use a simple function as an argument or return value. Here’s an example that uses a lambda expression as an argument to the built-in filter function:
aList = [1,2,3,4,5,6,7,8,9]
low = 3
high = 7
filter(lambda x,l=low,h=high: h>x>l, aList)    # returns: [4, 5, 6]
As an alternative, you can always use a local def statement that gives the function object a name. You can then use this name as the argument or return value. Here’s the same filter example using a local def statement:
aList = [1,2,3,4,5,6,7,8,9]
low = 3
high = 7
def test(value, l=low, h=high): return h>value>l
filter(test, aList)                            # returns: [4, 5, 6]
When the body of a function contains one or more occurrences of the keyword yield, the function is called a generator. When a generator is called, the function body does not execute. Instead, calling the generator returns a special iterator object that wraps the function body, the set of its local variables (including its parameters), and the current point of execution, which is initially the start of the function.
When the next method of this iterator object is called, the function body executes up to the next yield statement, which takes the form:
yield expression
When a yield statement executes, the function is frozen with its execution state and local variables intact, and the expression following yield is returned as the result of the next method. On the next call to next, execution of the function body resumes where it left off, again up to the next yield statement.
If the function body ends or executes a return statement, the iterator raises a StopIteration exception to indicate that the iterator is finished. Note that return statements in a generator cannot contain expressions, as that is a syntax error.
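The following sketch steps through a generator by hand to show the freeze-and-resume behavior and the exception that marks the end. The generator countdown is hypothetical. The code assumes a Python version that provides the built-in next function (2.6 and later); in the versions this section describes, you would call the iterator's next method directly instead.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

it = countdown(2)           # no body code runs yet
first = next(it)            # runs up to the first yield: 2
second = next(it)           # resumes after the yield: 1
try:
    next(it)                # the body ends, so the iterator is finished
    finished = False
except StopIteration:
    finished = True
```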
yield is always a keyword in Python 2.3 and later. In Python 2.2, to make yield a keyword in a source file, use the following line as the first statement in the file:
from __future__ import generators
In Python 2.1 and earlier, you cannot define generators.
Generators are often handy ways to build iterators. Since the most common way to use an iterator is to loop on it with a for statement, you typically call a generator like this:
for avariable in somegenerator(arguments):
For example, say that you want a sequence of numbers counting up from 1 to N and then down to 1 again. A generator helps:
def updown(N):
    for x in xrange(1,N): yield x
    for x in xrange(N,0,-1): yield x
for i in updown(3): print i            # prints: 1 2 3 2 1
Here is a generator that works somewhat like the built-in xrange function, but returns a sequence of floating-point values instead of a sequence of integers:
def frange(start, stop, step=1.0):
    while start < stop:
        yield start
        start += step
frange is only somewhat like xrange, because, for simplicity, it makes arguments start and stop mandatory, and silently assumes step is positive (by default, like xrange, frange makes step equal to 1).
Generators are more flexible than functions that return lists. A generator may build an iterator that returns an infinite stream of results that is usable only in loops that terminate by other means (e.g., via a break statement). Further, the generator-built iterator performs lazy evaluation: the iterator computes each successive item only when and if needed, just in time, while the equivalent function does all computations in advance and may require large amounts of memory to hold the results list. Therefore, in Python 2.2 and later, if all you need is the ability to iterate on a computed sequence, it is often best to compute the sequence in a generator, rather than in a function that returns a list. If the caller needs a list that contains all the items produced by a generator G(arguments), the caller can use the following code:
resulting_list = list(G(arguments))
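Both points can be sketched with two hypothetical generators: naturals never finishes on its own, so the loop that consumes it must stop by other means, while first_squares is finite and can be turned into a list when the caller needs one.

```python
def naturals():
    n = 1
    while True:             # never ends on its own
        yield n
        n += 1

small_squares = []
for n in naturals():
    if n * n > 50:
        break               # the loop, not the generator, decides when to stop
    small_squares.append(n * n)
# small_squares is [1, 4, 9, 16, 25, 36, 49]

def first_squares(count):
    for n in range(1, count + 1):
        yield n * n

as_list = list(first_squares(4))    # [1, 4, 9, 16]
```

Calling list on naturals() would never return, which is why an infinite generator is usable only in loops that terminate some other way.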
Python supports recursion (i.e., a Python function can call itself), but there is a limit to how deep the recursion can be. By default, Python interrupts recursion and raises a RuntimeError exception (covered in Chapter 6), with the message “maximum recursion depth exceeded”, when it detects that the stack of recursive calls has gone over a depth of 1,000. You can change the recursion limit with function setrecursionlimit of module sys, covered in Chapter 8. However, changing this limit will still not give you unlimited recursion; the absolute maximum limit depends on the platform, particularly on the underlying operating system and C runtime library, but it’s typically a few thousand. When recursive calls get too deep, your program will crash. Runaway recursion after a call to setrecursionlimit that exceeds the platform’s capabilities is one of the very few ways a Python program can crash (really crash, hard, without the usual safety net of Python’s exception mechanisms). Therefore, be wary of trying to fix a program that is getting recursion-limit RuntimeError exceptions by raising the recursion limit too high with setrecursionlimit. Most often, you’d be better advised to look for ways to remove the recursion or, at least, to limit the depth of recursion that your program needs.
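As a sketch of removing recursion rather than raising the limit, the two hypothetical functions below compute the same result, one recursively and one with a loop. The example assumes that exceeding the limit surfaces as RuntimeError; recent Python 3 releases raise RecursionError, which is a subclass of RuntimeError, so the same except clause catches it.

```python
import sys

def depth_recursive(n):
    if n == 0:
        return 0
    return 1 + depth_recursive(n - 1)

def depth_iterative(n):
    # same result as depth_recursive, but uses no extra stack per step
    count = 0
    while n > 0:
        count += 1
        n -= 1
    return count

too_deep = sys.getrecursionlimit() * 2

try:
    depth_recursive(too_deep)
    hit_limit = False
except RuntimeError:        # RecursionError in recent versions is a subclass
    hit_limit = True

iterative_result = depth_iterative(too_deep)   # works at any depth
```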