Chapter 4. Modules and Functions

The three basic building blocks of a Python program are modules, functions, and classes. This chapter will discuss modules and functions, while the next chapter will discuss classes. A Python module is a collection of statements that define variables, functions, and classes, and that is the primary unit of a Python program for the purposes of importing code. Importing a Python module actually executes the module. A function in Python is similar to functions or methods in most programming languages. Python offers a rich and flexible set of mechanisms for passing values to functions.

Modules

Python helps you to organize your programs by using modules. You can split your code among several modules, and the modules can be further organized into packages. Modules are the structural units of Python programs. In Java, this structural role is played directly by classes, and there is a strict correspondence between a file’s name and the class it contains. In Python, the filename is used only for organization, and does not require specific names to be used for any object within that file.

A Python module corresponds to a source code file containing a series of top-level statements, which are most often definitions of functions and classes. These statements are executed in sequence when the module is loaded. A module is loaded either by being passed as the main script when the interpreter is invoked or when the module is first imported by another module. This is the typical execution model for scripting languages, where you can take a recipe and easily try it out, but in Python the recipe can be more elegantly expressed using an object-oriented vocabulary.

At runtime, modules will appear as first-class namespace objects. A namespace is a dictionary-like mapping between identifiers and objects that is used for variable name lookup. Because modules are first-class objects, they can be bound to variable names, passed as arguments to functions, and returned as the result of a function.
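
As a minimal sketch of what first-class status means in practice, the following code binds a standard library module to a second name, passes it to a function, and gets a module back as a return value. The helper names describe and pick_module are hypothetical and exist only for this illustration.

```python
import math
import string

def describe(module):
    # A module is an ordinary object; its attributes are looked up like any other.
    return module.__name__

def pick_module(use_math):
    # Modules can be returned from functions like any other value.
    if use_math:
        return math
    return string

m = math                      # bind the module object to another name
name = describe(m)            # pass it as an argument
chosen = pick_module(False)   # receive a module as a result
```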

Modules are used as the global lexical scope for all statements in the module file—there is no single “global” scope in Python. The value of a variable binding referenced by a statement within a module is determined by looking in the module namespace, or within a local namespace defined within the module (by a function or class definition, for example). The contents of a module are first set up by the top-level statements, which create bindings between a name and a value. Name-binding statements include assignment, import, function definition, and class definition statements. So, although any kind of statement can appear at the top level in a module (there is no such thing as a statement that can appear only at the top level), name-binding statements play a pivotal role.

Unlike Java, all name bindings in Python take place at runtime, and only as the direct consequence of a name-binding statement. This is a straightforward model that is slightly different from the Java model, especially because Python import statements have very different behavior from their Java counterparts. (Jython does have a compilation phase, but it is transparent to the user and no Python name bindings are set there. Jython compiles Python source code to Java bytecode, which is then dynamically loaded and creates the Python bindings at runtime.) A Java import statement merely allows a specific class or classes to be used with unqualified names, while a Python import statement actually executes the imported file and makes available all the names defined within it. A name binding in a module is available to other statements in the module after the binding is created. The dot operator, as in module.attribute, is used to access variable names bound in an imported module.

One consequence of the module and import semantics of Python is that Python does not force you to use object-oriented programming. You can mix procedural-style or functional programming in your modules when it makes sense to do so. Although it has been said that procedural programming does not make for very reusable code, this is true mostly for large programs written in statically typed programming languages. However, Python is a dynamically typed language, and because functions have the same first-class status as all other values, even straight procedural code can be reusable. With Python’s dynamically typed functions, you have the benefits of a generic programming paradigm such as C++ function templates, but with a simpler and more powerful model. In Python, it is easy to write common utility algorithms that take functions as arguments. Functions of this sort are sometimes called “pluggable,” and are very easy to reuse.
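
A small sketch of a “pluggable” utility might look like the following. The names apply_to_each and double are made up for this example; the point is only that the same procedural helper works with any callable and any argument type that supports the operations used.

```python
def apply_to_each(func, items):
    # func is any callable; the caller "plugs in" the behavior.
    result = []
    for item in items:
        result.append(func(item))
    return result

def double(x):
    # Works for any object that supports multiplication by an integer.
    return x * 2

doubled_numbers = apply_to_each(double, [1, 2, 3])
doubled_strings = apply_to_each(double, ["ab", "c"])
lengths = apply_to_each(len, ["a", "bc", ""])
```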

In the next sections, we will cover how to write function definition statements and how to use functions and import statements for Python modules and Java classes. Object-oriented class definition and semantics will be discussed in Chapter 5. The other name-binding statements, including the assignment statement and the for statement, are covered in Assignment and Loops in Chapter 3.

Functions

The simplest form of a function definition is as follows:

def funid([arg,...]):
    block

At runtime this will create a function object and bind it to the name funid. When called, the function object will execute the block of statements, which can refer to the variables defined in the argument list. These will be bound to the values of the actual arguments as computed when the function is called. Functions are called in the usual way, through the ( ) call operator:

callable-object([expr,...])

The calling convention, pass-by-value, is the same as for objects in Java, where the value being passed is the object reference, not the underlying object (this convention is sometimes called pass-by-object-reference). This means that changes to the underlying object will be visible outside the function; however, changes to the variable binding (such as reassigning it within the function) will not be visible outside the function. As in Java, there is no direct support for pure call-by-reference.
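
The difference between changing the underlying object and rebinding the argument name can be sketched as follows. The function names mutate and rebind are illustrative only.

```python
def mutate(items):
    items.append(4)       # changes the object the caller also refers to

def rebind(items):
    items = [0]           # rebinds only the local name; the caller is unaffected
    return items

data = [1, 2, 3]
mutate(data)
after_mutate = list(data)     # snapshot after the in-place change
rebind(data)
after_rebind = list(data)     # unchanged by the rebinding inside rebind
```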

There is nothing magical about function objects in Python that enables them to be called using the call operator. As we will see in Special Methods in Chapter 5, any Python class can be defined to respond to the call operator, essentially allowing an instance of that class to mimic a function object.

Because Python is a dynamically typed language, there are no type declarations for the arguments or the return value in a function definition statement. Python does not support function overloading, as Java or C++ do (but see Parameter Passing with Style later in this chapter for the Python equivalent). A subsequent function definition for funid, like any other name-binding statement, will simply rebind funid without triggering an error. This is true even if the later binding is just a variable assignment—you cannot have functions and variables with the same name within a Python namespace. On the other hand, a function in Python is fully generic, in that any set of arguments can be passed to it. If the objects passed as arguments do not support the operations performed inside the function, a runtime error will be raised.

All functions return a value, unless they are abandoned because of a raised exception. A return statement:

return [expr]

can be used to force the return of control to the caller and specify the return value as the one computed by expr. In the case of a bare return without an expression, or if the end of a function is reached without returning a value explicitly, the function will return the value None.

In Python, it is also possible to return multiple values from a function by building a tuple on the fly—for example, return head, tail. Then you can use unpacking assignment at the call site or work directly with the tuple.
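
As a brief sketch, a hypothetical split_first function can return two values at once, and the caller can either unpack them or keep the tuple:

```python
def split_first(seq):
    head = seq[0]
    tail = seq[1:]
    return head, tail         # builds the tuple (head, tail) on the fly

head, tail = split_first([1, 2, 3])   # unpacking assignment at the call site
pair = split_first("abc")             # or work with the tuple directly
```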

You also do not need to declare local variables in Python, but of course there are local variables. A variable is treated as local in a function (and more generally as local in any scope) if it is bound for the first time through any of the name-binding statements within the function or scope. Arguments are also local, and they are implicitly bound at call time. You will find more information on scoping rules in the Scoping Rules section, later in this chapter.

Tip

CPython 2.2 introduces a special kind of function called a generator. A generator is defined the same way as a regular function, but instead of a return statement, it uses the new keyword yield in a statement of the form yield expr. When a generator function is called, it returns a generator object, which supports the iterator protocol. When that iterator’s next( ) method is invoked, the function body executes until it encounters a yield statement, at which time it returns the value of the expression in that statement. When next( ) is invoked again, execution continues from the point of the yield, with the state of all local variables preserved. The generator runs until it encounters a yield again, or until it exits normally. Generator objects can be used anywhere iterators can, including the righthand side of a for statement. The for statement calls next( ) repeatedly until the generator raises a StopIteration exception, which also happens automatically when the generator function exits normally.

For example, the following simple generator returns an increasing range of integers one at a time:

def generateInts(N):
    for i in range(N):
        yield i
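
A short usage sketch for this generator follows. The definition is repeated so that the example stands alone; the variable names are ours. On CPython 2.2 itself, generators are enabled per module with from __future__ import generators, which is omitted here.

```python
def generateInts(N):
    for i in range(N):
        yield i

# A for statement drives the generator through the iterator protocol.
collected = []
for value in generateInts(3):
    collected.append(value)

# Each call creates a fresh, independent generator object.
as_list = list(generateInts(5))
```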

Parameter Passing with Style

Python supports many useful features related to parameter passing, through fancier argument specifiers. All these features are absent from Java.

First, you can specify a default value for an argument, which makes the argument optional. Just add the value after the arg in the argument list like so: arg=expr (by stylistic convention, you do not put spaces around the equals sign in the argument list). The expression is evaluated once and only once when the function definition statement is executed. It is not re-executed every time the function is called. If the expression value is a mutable object (e.g., a list), it is shared by all the function invocations. Therefore, you need to be careful, because any changes you make to the object in place (such as using append( )) are then visible to future function calls. Here is some code that shows a default argument in action.

a = 2
def func(x=a):
    print x

func( )
func(1)
a = 3
func( )

2
1
2

In this example, the rebinding of a = 3 does not affect the binding in x=a in the function statement, because the x=a binding was executed first and is not re-executed on each function call. A typical idiom that you can use to cope with problems that can arise when trying to use an optional, mutable list argument is to default the argument to None and create a new list inside the function whenever the argument is omitted:

def manip(..., l=None):
    if l is None:
        l = [1,2,3]
    ...

A default value can depend only on variable names that are valid in the namespace when the function is defined—it cannot depend on the other arguments to the function. If you make an argument optional, all subsequent arguments in the function must also have default values.
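
To make the danger of a shared mutable default concrete, here is a sketch that contrasts the two approaches. Both function names are hypothetical.

```python
def remember_shared(item, seen=[]):
    # The default list is created once, when the def statement is executed.
    seen.append(item)
    return seen

def remember_fresh(item, seen=None):
    if seen is None:
        seen = []             # a new list on every call that omits seen
    seen.append(item)
    return seen

first = list(remember_shared("a"))
second = list(remember_shared("b"))   # still holds "a" from the first call
third = remember_fresh("a")
fourth = remember_fresh("b")          # starts from an empty list again
```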

Python has a richer syntax than Java for passing arguments when a function is called. At call time, you can include any argument in the call using the same arg=expr style. For example, the function func in the previous example could be called as func(x=3). This is called a keyword argument. Keyword arguments can be in any order (but must come after the standard arguments), and their expressions are passed to the appropriate argument.

The keyword argument syntax works for all user-defined functions, but unfortunately does not work for many of the Python built-in functions. You cannot use keyword arguments when calling a Java method from Jython, since ordinary Java compilation causes the loss of variable name information (there is a partial exception for constructors; see Using Beans in Jython in Chapter 8). If you have experience with calls to heavily overloaded Java methods with many arguments, you can see that appropriate use of defaults and keyword arguments can increase code readability and clarity for your Python code.

If you end a function’s argument list with a name that has a * in front of it, such as *rest, rest captures in a tuple any excess arguments passed to the function. By using this syntax, a function can take a variable number of arguments. You can also have a second catch-all name at the end of your list, with ** in front. The double-star argument captures in a dictionary any keyword arguments that are not already specified in the argument list. If both of these argument types exist in the function, the tuple argument must come first.

Here is a summary of the complete function definition syntax:

def funid([arg,...[,*rest[,**kwargs]]]):
    block

The following function will be used to make the syntax clearer:

def func(a, b=0, c="fred", *d, **e):
    print a, b, c, d, e

Ordinary arguments are bound left to right and defaults are filled in. Notice that the catch-all arguments are empty:

func(1, 2, 3)
func(1, 2)
func(1)

1 2 3 () {}
1 2 fred () {}
1 0 fred () {}

Keyword arguments are explicitly bound to the named argument after the ordinary arguments are bound from left to right. At the end of the call, all arguments (except the catch-alls) must have either a default, an ordinary argument, or a keyword argument. In the preceding code, the argument a must be bound either with a keyword argument, or by having the call start with an ordinary argument:

func(1, c=3)
func(1, 2, c=3)
func(b=2, a=1)

1 0 3 () {}
1 2 3 () {}
1 2 fred () {}

Finally, assuming that all the listed arguments are filled, the catch-all arguments grab any extras:

func(1, 2, 3, 4, 5, 6)
func(1, 2, c=12, f="hi", g="there")
func(1, 2, 3, 4, g="there")

1 2 3 (4, 5, 6) {}
1 2 12 () {'g': 'there', 'f': 'hi'}
1 2 3 (4,) {'g': 'there'}

Scoping Rules

Function definitions can be nested. The block of statements in a function definition (or class definition) is placed in a local scope in the same way that top-level statements are placed in the global scope. Ordinary control-flow statements and list comprehensions do not introduce new scopes.

From the beginning of time until Version 2.1, Python variable names were resolved this way:

  1. If the name is bound by some name-binding statement in the current scope, all usage of that name in the scope refers to the binding in the local scope. This is enforced during the transparent compilation phase. If such a name is used at runtime before it is actually bound, this produces an error.

  2. Otherwise, a possible binding of the name in the global scope is checked (at runtime) and if such a binding exists, this is used.

  3. If there is no global binding, Python looks in the built-in namespace (which corresponds to the built-in module __builtin__). If this also fails, an error is issued.

Under these rules, there are only three scopes: local, global, and __builtin__. Scopes do not nest. Therefore, a function whose definition is nested inside another function cannot refer to itself recursively because its name is bound in the enclosing namespace, which is in neither the inner function’s scope nor the global or built-in scopes. In addition, names in the enclosing namespace, but not in the global namespace, also cannot be used. The following code shows the potential for “gotchas.”

def outerFunc(x, y):
    def innerFunc(z):
        if z > 0:
            print z, y
            innerFunc(z - 1)
    innerFunc(x)

outerFunc(3, "fred")

This code has two name errors that a Java programmer may not be expecting. The use of y in the print statement is a name error because y is only defined in the outerFunc scope, not in the local or global scope. The usual workaround in this case is to change the definition of innerFunc to def innerFunc(z, y=y):, which works but is undeniably awkward. Even with that workaround, the next line containing the call to innerFunc is also a name error for the same reason. In practice, this is not much of an issue (unless you use lambda expressions a lot, there is rarely a reason why functions need to be nested in Python).

With Version 2.2 of CPython, however, new rules will replace the old ones, allowing access from one scope to bindings in the enclosing scopes in the way that a Java programmer would expect. The new rules can already be activated in Jython 2.1 and CPython 2.1 on a per-module basis, by putting the following __future__ statement before any other statement in the module:[7]

from __future__ import nested_scopes

With the new rules, if a name is locally bound in a scope by some statement in that scope, then every use in the scope refers to this binding. If not, the name is called free.

A free name, when used, refers to the binding in the nearest enclosing scope that contains a binding for that name, ignoring scopes introduced by class definition. If no such explicit binding exists at compile time, Python tries to resolve the name at runtime, first in the global scope and then in the built-in namespace.

The practical meaning of the new rule is that an inner function can refer to the bindings of its enclosing functions, even after those functions have returned. What the inner function captures is the binding itself, not a copy of the value, so it sees whatever object the name is bound to when the inner function runs. With these rules, it is still impossible for the inner scope code to rebind those outer names; using a name-binding statement simply creates a new local binding. In this way, Python diverges from most other languages with nested scopes. Under the new rules, the function outerFunc will work perfectly.
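
The following sketch shows both halves of the rule: an inner function reading a name from its enclosing scope, and a name-binding statement in the inner function creating a new local binding rather than changing the outer one. The function names are hypothetical, and on Jython 2.1 or CPython 2.1 the module would also need the nested_scopes __future__ statement.

```python
def make_labeler(prefix):
    def label(n):
        # prefix is a free name, resolved in the enclosing function's scope.
        return "%s-%d" % (prefix, n)
    return label

def shadow_demo():
    value = "outer"
    def inner():
        value = "inner"       # a new local binding; the outer one is untouched
        return value
    inner_result = inner()
    return inner_result, value

item_label = make_labeler("item")
labeled = item_label(3)
shadowed = shadow_demo()
```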

Under both sets of rules, you can always force a name to refer to the global/built-in binding, even if it occurs in a more local name-binding statement, by using the global declaration statement:

global name[,...]
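
As a minimal sketch of the global declaration, compare a function that declares the name global with one that does not. The names counter, increment, and no_effect are chosen for illustration.

```python
counter = 0

def increment():
    global counter            # assignments here rebind the module-level name
    counter = counter + 1

def no_effect():
    counter = 100             # without the declaration, this is a new local name
    return counter

increment()
increment()
no_effect()                   # leaves the module-level counter alone
```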

Flying First Class

We have already mentioned more than once that Python functions are first-class objects. They can be stored in variables, returned from other functions, and passed around the same as any other object. The function definition statement simply creates such an object and binds it to a name.

It is also possible to create a function object without binding it to a name using the lambda operator, which was briefly introduced in Functional Programming in Chapter 2. The syntax of a lambda is a little different from the def statement:

lambda args: expr

The argument list args has the same structure as the argument specifier of a def statement. Because lambda is an operator, not a statement, it can appear inside an expression and produces an anonymous function object. This function can be called like any other function. When called, it evaluates the expression expr using the actual values of the arguments and returns the obtained result. Therefore, the following bits of code are equivalent:

fun_holder = lambda args: expr

def fun_holder(args):
    return expr

And both versions are called using the syntax:

fun_holder(args)
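
A concrete instance of this equivalence, using squaring as the expression, might look like this. The names are ours.

```python
square = lambda x: x * x      # anonymous function object bound to a name

def square_def(x):
    return x * x

# A lambda can also be written inline wherever a function object is expected.
squares = list(map(lambda x: x * x, [1, 2, 3]))
```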

We will use functions as first-class objects often throughout this book. However, the full implications for program design of having first-class functions are beyond our scope. We’ll limit ourselves to some more reference material and two examples.

Python offers a built-in function that enables you to dynamically call a function object (or any callable object) with a computed set of arguments. This is useful in the case where you do not even know exactly how many or which arguments will be used at compile time. It is also useful at times when your arguments have been calculated in a sequence or dictionary, but the function being called expects the arguments separately. The syntax is:

apply(function[, args [, kwargs]])

where args should be a sequence specifying the positional arguments, and kwargs is a dictionary for the keyword arguments. Both are optional. For example, use the same function we used before:

def func(a, b=0, c="fred", *d, **e):
    print a, b, c, d, e

samefunc = func
samefunc(1, 2, 3)

t = (1, 2, 3)
kw1 = { "g": "hi" }

apply(samefunc, t)                   # equivalent to samefunc(1, 2, 3)
apply(samefunc, t[:2], kw1)          # equivalent to samefunc(1, 2, g="hi")
apply(samefunc, (1,), {"b": "hi"})   # equivalent to samefunc(1, b="hi")

1 2 3 () {}
1 2 3 () {}
1 2 fred () {'g': 'hi'}
1 hi fred () {}

The sequence argument to apply is treated as though the arguments were listed one at a time and left to right. The dictionary argument to apply then works as though each key/value pair in the dictionary were passed as a keyword argument.

In Jython 2.0 and later (CPython introduced this in 1.6), you can also pass sequences or dictionaries in the same “exploded” manner that apply uses by placing * for sequences or ** for dictionaries before the argument at call time. The arguments are applied to the function exactly as they would be in apply. So, the apply calls in the preceding code could also be written as:

samefunc(*t)
samefunc(*t[:2], **kw1)
samefunc(*(1,), **{"b": "hi"})

and would return the same results. This is purely syntactic sugar, but can sometimes be easier to read.

There is also a built-in module, operator, that defines corresponding functions for all Python operators, such as operator.add for the + operator. By combining apply and operator, you can mimic any set of static actions at runtime using dynamic functions or operators. The typical use of this module is to minimize the need to use lambda expressions when using the reduce( ) function.
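
Here is a sketch of the operator module standing in for a lambda in a call to reduce. In the Python described in this book, reduce is a built-in; the fallback import is an assumption added so that the sketch also runs on newer interpreters, where reduce lives in functools.

```python
import operator

try:
    reduce                            # built in on the Python described here
except NameError:
    from functools import reduce      # assumption: newer interpreters only

numbers = [1, 2, 3, 4]
total_with_lambda = reduce(lambda a, b: a + b, numbers)
total_with_operator = reduce(operator.add, numbers)   # same result, no lambda
product = reduce(operator.mul, numbers)
```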

The following example shows how to construct a set of function objects dynamically for building some HTML text (in a rather simple-minded way) through the use of a helper function. The example shows how you might use nested function definitions, nested scopes, and first-class functions.

When the outer function is called, it constructs and returns a new function object. Each constructed function deals with a given tag, passed as a parameter to the helper function, and wraps its input between opening and closing versions of the tag, as defined in the outer function. Each constructed tag function can take as input any number of text fragments that will be concatenated and can set arbitrary attributes for the tag, through keyword arguments.

from __future__ import nested_scopes

def tag_fun_maker(tag):
    open = "<%s" % tag
    close = "</%s>" % tag
    def tagfunc(*content, **attrs):
        attrs = ['%s="%s"' % (key, value) for key, value in attrs.items( )]
        attrs = ' '.join(attrs)
        return "%s %s>%s%s" % (open,attrs,''.join(content),close)
    return tagfunc

html = tag_fun_maker("html")
body = tag_fun_maker("body")
strong = tag_fun_maker("strong")
anchor = tag_fun_maker("a")

print html(body(
    "\n",
    anchor(strong("Hello World from Jython!"),
    href="http://www.jython.org"),
    "\n"))

<html ><body >
<a href="http://www.jython.org"><strong >Hello World from Jython!</strong></a>
</body></html>

The main point of this example is that tag_fun_maker is able to create a function dynamically and return it. The example also parameterizes the created functions using the new nested scope rules, which cause the inner function to act as a lexical closure, able to refer to variables in the outer scopes even after tag_fun_maker has returned.

Here is an idiom borrowed from Smalltalk that you might call “do around.” Sometimes, you have a resource that needs to be opened, used, and closed in a variety of places, which might cause you to repeat the open and close logic frequently. A typical example is a file.

def fileLinesDo(fileName, lineFunction):
    file = open(fileName, 'r')
    result = [lineFunction(each) for each in file.readlines( )]
    file.close( )
    return result

The key here is that you don’t have to rewrite the open and close statements each time you use the file. This is a trivial point for files, but if we added error checking or had a more complicated resource, it would be very useful.
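
One way the error checking might be added is with a try/finally statement, so that the file is closed even if the line function raises an exception. This is a sketch rather than the only possible design; the function name file_lines_do and the temporary file and its contents are made up for the usage example, which relies on the tempfile module of a current Python.

```python
import os
import tempfile

def file_lines_do(file_name, line_function):
    f = open(file_name, 'r')
    try:
        # The caller supplies only the per-line behavior.
        return [line_function(each) for each in f.readlines()]
    finally:
        f.close()             # runs even if line_function raises

# Usage on a throwaway file (contents invented for the example).
handle, path = tempfile.mkstemp()
os.write(handle, b"one\nthree\n")
os.close(handle)
lengths = file_lines_do(path, len)    # line lengths include the newline
os.remove(path)
```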

Import Statements and Packages

In Python, modules are either built-in (e.g., the sys module) or defined by external files, typically source files with a .py extension. Python comes with a set of external modules that form the Python standard library, which is mostly shared between Python and Jython. Most CPython modules written in Python work directly in Jython. The C extension modules for CPython that are written in C and not Python cannot be used from Jython, although some of these modules have been ported from C to Java.

External modules are retrieved from a path, which, like Java’s classpath, is a set of directories. In Python, the path is stored as a list of strings in the sys.path attribute of the sys module (e.g., ['', 'C:\\.', 'e:\\jython\\Lib', 'e:\\jython']), which by default should point at the current working directory and at the Python standard library directory. The path can be changed for all your Jython modules by editing the Jython registry (see Appendix B). Also, the sys.path variable can be changed dynamically by Python code.
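
Changing the path dynamically is ordinary list manipulation, as the following sketch suggests. The directory name is hypothetical and does not need to exist for the code to run.

```python
import sys

extra_dir = "/tmp/my_modules"         # hypothetical directory for illustration

if extra_dir not in sys.path:
    sys.path.append(extra_dir)        # searched after the existing entries

search_path = list(sys.path)          # a snapshot: a plain list of strings
```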

Python modules can be organized in packages using the directory structure in a manner similar to Java. But a subdirectory of one of the directories in sys.path is considered a package only if it contains an __init__.py Python source file and if all its parent directories do as well, up to but excluding the one in sys.path (there is no concrete default/root package in Python). The names of the packages (directories) in the chain down to the bottommost package (subdirectory) are separated by dots to form the qualified name of the package (e.g., foo1.foo2.foo4). A module gets a qualified name by appending its name (the filename without the .py extension) to the qualified name of its parent package (a top-level module has no parent package).

Jython 2.1 will also allow .zip and .jar archive files to be placed on the Python path. The exact specification is still not complete as of this writing (and may change due to a proposal to put similar functionality in CPython 2.2). However, the basic idea is that the files and directories compressed in the archive would be treated exactly as though they were an uncompressed part of the filesystem.

At runtime, first-class module objects are created for modules and for packages; a package object is created by loading the package’s __init__.py module. The __init__.py file can contain statements that initialize the package, but it is often completely empty and serves just as a marker. This is different from Java, in which packages do not have first-class status.

When loading a module, Python ensures that all its parent packages are loaded as well. Loading for a module happens once, and the loaded module and package objects are cached in a dictionary stored in sys.modules with their qualified names as keys.
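
The caching behavior can be observed directly in sys.modules. This sketch uses the standard library package xml and its subpackage xml.dom purely as a convenient example of a qualified name.

```python
import sys
import xml.dom                # loading xml.dom also loads its parent package xml

parent_cached = sys.modules["xml"] is xml
child_cached = sys.modules["xml.dom"] is xml.dom
# The child is also bound as an attribute in the parent package's namespace.
child_is_attribute = getattr(sys.modules["xml"], "dom") is xml.dom

import xml.dom                # a second import reuses the cached objects
still_same = sys.modules["xml.dom"] is xml.dom
```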

When loaded, a package/module is bound with its name in its parent package’s namespace (if there is a parent package). So, at runtime packages and modules are retrieved through normal attribute lookup. In Python, unlike Java, there is no name resolution against packages or of qualified names at compilation time. Name resolution takes place only at runtime. Requests for modules and packages to be loaded can be issued through import statements, which are name-binding statements executed at runtime.

When a module is loaded, and before the execution of its top-level code, the identifier __name__ is bound to the qualified name of the module in the module’s global namespace. This allows you to access the name by which the module is called by the outside world. The case of a module passed as main script by directly invoking the interpreter is special; in that case, __name__ is always set to '__main__', and not to the actual name of the module. The following idiom is typical:

if __name__ == '__main__':
  ... # code

and is often used to put some test code in a library module, or to give it a working mode as a standalone utility. This idiom is roughly analogous to the special main( ) method in Java.

Import Statements

Import statements always ensure that their targets are compiled and loaded, along with all the packages in the qualified name of the target. Moreover, except in the case of the main module called by invoking the interpreter, Jython caches the results of the transparent compilation that takes place when a module foo.py is loaded in the class file foo$py.class.

Python offers two different import statements: import and from. They differ in how they place the imported module within the calling module’s namespace.

The syntax of import is:

import qualmod [as name][,...]

If qualmod specifies a bare module, this module is bound to its name in the current scope. Otherwise, the top package specified by qualmod is bound to its top-level name. For example, import foo.bar ensures that both the package foo and the module foo.bar are loaded and binds the name foo to the foo module object in the current scope. Then bar and its contents are accessible through attribute lookup. With as name, the target module (not the top package) is bound to name, and the top-level package is not bound at all.
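
The two binding behaviors can be sketched with a standard library package. The choice of xml.dom and the alias dom_module are ours.

```python
import xml.dom                    # binds only the top-level name xml
import xml.dom as dom_module      # binds the target module itself to dom_module

top_name = xml.__name__
reached_by_attribute = xml.dom.__name__   # the submodule via attribute lookup
alias_is_target = dom_module is xml.dom
```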

The syntax of from is:

from qualmod import name1 [as alias1][,name2 ...]

The value of qualmod.name1 is bound to name1 in the current scope, or to alias1 if that is specified. The qualmod itself is not bound in the current scope. These semantics mean that any later rebinding of qualmod.name1 (for example, by reloading qualmod) will not affect name1 in the importing scope. By using the from statement, the names imported are accessible directly in the calling module without using the dot operator.
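
The independence of the imported name from later rebinding in the module can be demonstrated with a sketch like the following. Rebinding math.pi is done only for demonstration, and the original value is restored at the end.

```python
import math
from math import pi

original = math.pi
math.pi = 3                           # rebind the attribute in the module
local_unchanged = (pi == original)    # the name bound by from is unaffected
module_changed = (math.pi == 3)
math.pi = original                    # restore the module for later code
```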

There is a special form of from:

from qualmod import *

This statement takes all the bindings in qualmod whose names do not start with '_', and binds them to the very same name in the current scope. If qualmod.__all__ exists, it should be a list of strings, and then only the names listed there will be considered and bound.

At first, from statements may seem more convenient than import statements because the variable names are accessible directly, and you don’t have to continually type the module name. However, you do need to be careful when using from. It is possible that a from ... import * statement could rebind things that you don’t want rebound, such as the names of built-in functions. This is especially true if the from statement does not occur at the top of the module. Starting with Python 2.1, a correct __all__ attribute has been added to most modules in the standard library, but you should still pay attention to this problem with your own modules. In Python 2.2, or in Python 2.1 with nested scopes enabled, from ... import * statements are allowed only at the top level of a module because the result of the import is ambiguous in a nested scope if bindings from the imported module shadow an existing reference.

Also, because from statements create new name bindings separate from the module name bindings, you cannot see changes made in the imported module when it is reloaded. Although this is not usually a problem in running code, it can be a significant annoyance if you are developing using an interactive session—you will continually find that modules are not seeing other module changes.

Both import and from work by calling a built-in special function __import__, which looks like this:

__import__(moduleName[, globals [, locals [, fromlist]]])

Internally, Python converts import and from statements to an __import__ call. The arguments are a string for the fully qualified name of the module to be imported, dictionaries of current global and local namespace bindings, and the list of items to import, if the statement is a from.[8] If the call is from ... import *, the last argument is ['*']. Then the function imports the module represented by moduleName and returns the top-level package if fromlist is empty, and the bottom-level module or package if it is not. The actual binding of names is left to Python and is not performed by the function.

You can call this function directly in your programs to control module import dynamically, for example, to load modules one at a time into a test suite. You can even replace this function with your own custom import function by rebinding __builtin__.__import__. In Jython, this can also be used to import Java classes and packages.
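Both uses can be sketched briefly. The names load_by_name, tracing_import, and requested are hypothetical helpers invented for this example, and the try/except at the top is there only because Python 3 renamed __builtin__ to builtins; on Jython the first branch applies.

```python
import sys

try:
    import __builtin__ as builtins_module  # Python 2 and Jython
except ImportError:
    import builtins as builtins_module     # Python 3 renamed the module


def load_by_name(names):
    """Import each named module and return the module objects in order."""
    loaded = []
    for name in names:
        __import__(name)                 # returns the top-level package
        loaded.append(sys.modules[name])  # so fetch the exact module here
    return loaded


original_import = builtins_module.__import__
requested = []


def tracing_import(name, *args, **kwargs):
    """Record every requested module name, then delegate to the original."""
    requested.append(name)
    return original_import(name, *args, **kwargs)


# Install the custom import function, and always restore the original.
builtins_module.__import__ = tracing_import
try:
    import json  # an ordinary import statement goes through the hook
    modules = load_by_name(["xml.dom", "string"])
finally:
    builtins_module.__import__ = original_import
```

Restoring the original function in a finally clause matters here, because a replaced __import__ affects every import statement in the process, not just those in the current module.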

Importing Java Classes

Jython also allows access to Java classes and packages through the import statements. Jython is able to load classes through the underlying Java Virtual Machine (JVM), both from the Java classpath and from the directories in sys.path. Conceptually, you can think that for the purpose of loading Java classes, the directories in sys.path have been appended to the classpath. This means there is no need to mark Java packages on sys.path with __init__.py modules, because that would make them Python packages.

Python packages and modules take precedence over Java packages. On the other hand, Java classes in a shadowed Java package can still be loaded from the shadowing Python module.

Because there is no Java API for asking the JVM directly which Java packages can potentially be loaded, Jython scans the available .jar files at startup and scans the candidate directories for Java packages and classes at runtime. The information obtained from .jar files is cached between sessions. All this is done to properly interpret import statements that can trigger loading of Java classes or packages. You’ll notice that if you add a .jar file to your classpath, then the next time you start Jython, you’ll see a message that the new file has been identified.

In Java, import statements are a matter of name resolution at compilation time. We have already seen that the Python model is different; import statements, even for packages, bind names to concrete objects at runtime. To fit Java loading in this overall model, Jython creates module-like, unique concrete objects (instances of the internal org.python.core.PyJavaPackage class) for Java packages.

A statement such as:

import java.lang

will bind java in the current scope to the module-like object for the Java package java. Moreover, this namespace object will map lang to the java.lang package. You can then access the String class by referring to it as java.lang.String. The Java feature of having the java.lang classes automatically imported does not work in Jython. However, the statement import java will give access to any class in the standard Java library, provided you fully qualify the name (such as java.util.List).

Java classes are wrapped by Jython in objects that mimic both Python-like class behavior and the original Java class behavior. A binding to this wrapper object will be set up by Jython in the module object for the package when import is requested.

All this has implications for the overhead of a statement such as from javapkg import *. This kind of statement creates wrappers for all the classes in javapkg and binds them both in the current scope and in the javapkg namespace. Technically, in this special case the wrappers do not load the Java classes immediately, but lazily as needed. Even so, it is not a good idea to use from javapkg import * liberally for Java packages in production code, just as one might avoid using import javapkg.* in Java (which is also against the Sun coding guidelines for Java). The from javapkg import * statement can still be a useful feature, for example, when experimenting in an interactive session.

Auto-loading through lookup

In Java you can always refer to a class by using its fully qualified name without the need for a previous import statement. Given what we have explained about Python import statements, this is not true in Jython, but Jython does offer a shortcut.

If you need to refer to your.favorite.UsefulClass, you do not need an import your.favorite.UsefulClass first. You can simply import the top package your with import your, and then you can use attribute lookup to reach your.favorite.UsefulClass and also your.favorite.InvisibleClass, and so on. For example, you can get access to most of the Java classes by merely using the statement import java. Subpackages can then be accessed using the dot operator.

import java
x = java.util.Vector
print x

This syntax works because in Jython, attribute lookup on a Java package object triggers loading as necessary. Jython behaves the same way for the import of Python packages, but it should be noted that this feature is not offered by CPython, so it is not portable.
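The portability caveat can be seen on the CPython side with a small sketch. The package demo_autoload_pkg and its submodule sub are hypothetical and are written to a temporary directory only for this illustration. Under Jython, the text above implies that the first check would already succeed.

```python
import os
import sys
import tempfile

# Build a throwaway package on disk; the names are hypothetical.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_autoload_pkg")
os.mkdir(pkg_dir)
for filename, source in [("__init__.py", ""), ("sub.py", "VALUE = 42\n")]:
    with open(os.path.join(pkg_dir, filename), "w") as handle:
        handle.write(source)

sys.path.insert(0, root)

import demo_autoload_pkg

# In CPython, attribute lookup on the package does not trigger an import,
# so the submodule is not reachable yet.
visible_before = hasattr(demo_autoload_pkg, "sub")

import demo_autoload_pkg.sub

# The explicit import binds sub as an attribute of the package object.
visible_after = hasattr(demo_autoload_pkg, "sub")
value = demo_autoload_pkg.sub.VALUE
```

Code that must run on both implementations should therefore import each Python subpackage explicitly rather than rely on lookup.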

Reload

A nice feature of Python is its support for dynamically reloading modules through the reload built-in function:

reload(mod)

The reload function takes a module mod and executes the top-level statements in its (possibly changed) corresponding file, reusing mod as global scope, after it has been emptied. The function then returns the possibly altered and reinitialized mod. Reload is most frequently used during development, when you might be continually testing a module as you change its source. To have the Python interactive interpreter recognize the changes, you need to reload the module. Reload can also be useful in the context of an application that needs to be dynamically reconfigured or upgraded at runtime.

It should be noted that reload does not operate recursively on the modules imported by mod, and that all the bindings to old values originally from mod in other modules remain in place. Code has access to the new values after reload only if it uses attribute lookup on the module. Specifically, a module that imported some values from mod through from ... import * will keep its bindings to the old values. Also, existing instances of a class defined in mod will not be affected by the possibly changed definition. Jython also ships with the jreload module, which offers some support for reloading Java classes.
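These rules can be checked with a small sketch. The module demo_reload_mod, its Greeter class, and the write_module helper are hypothetical, and the module file is written to a temporary directory. The try/except is needed only because Python 3 moved reload from the built-ins into importlib; on Jython the built-in is used.

```python
import os
import sys
import tempfile

try:
    from importlib import reload  # Python 3 moved reload out of the built-ins
except ImportError:
    pass  # Python 2 and Jython: reload is already a built-in

# Keep the demonstration independent of cached bytecode files.
sys.dont_write_bytecode = True

root = tempfile.mkdtemp()
path = os.path.join(root, "demo_reload_mod.py")


def write_module(version):
    """Write (or overwrite) the hypothetical module with the given version."""
    with open(path, "w") as handle:
        handle.write(
            "VERSION = %r\n"
            "class Greeter(object):\n"
            "    def greet(self):\n"
            "        return 'hello from %s'\n" % (version, version)
        )


write_module("first")
sys.path.insert(0, root)

import demo_reload_mod
from demo_reload_mod import VERSION  # a separate binding in this namespace

old_instance = demo_reload_mod.Greeter()

# Change the source on disk, then reload the module object in place.
write_module("second")
reloaded = reload(demo_reload_mod)

same_object = reloaded is demo_reload_mod  # reload returns the same module
via_attribute = demo_reload_mod.VERSION    # attribute lookup sees the new value
via_from = VERSION                         # the from binding keeps the old one
old_greeting = old_instance.greet()        # old instance keeps the old class
new_greeting = demo_reload_mod.Greeter().greet()
```

Only the lookups that go through the module object pick up the second version; the name bound by from and the instance created before the reload still refer to the first.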



[7] Future statements such as from __future__ import feature[,...] are both import statements and directives to the compiler to activate features that will become mandatory in a future release—part of the Python strategy for gradually introducing new features. They should appear before any conventional statements.

[8] The dictionaries are ignored by the built-in version, except for the global bindings used to retrieve the importing package’s __name__. This name information is used to implement package-relative imports, a feature whose usage is discouraged. If you call the function directly and want to supply a fromlist, {} is a fine placeholder for both.
