If you’ve bought this book, you may already know what Python is, and why it’s an important tool to learn. If not, you probably won’t be sold on Python until you’ve learned the language by reading the rest of this book and have done a project or two. But before jumping into details, the first few pages briefly introduce some of the main reasons behind Python’s popularity. To begin sculpting a definition of Python, this chapter takes the form of a question and answer session, which poses some of the most common non-technical questions asked by beginners.
Because there are many programming languages available today, this is the usual first question of newcomers. Given the hundreds of thousands of Python users out there today, there really is no way to answer this question with complete accuracy. The choice of development tools is sometimes based on unique constraints or personal preference.
But after teaching Python to roughly one thousand students and almost 100 companies in recent years, we have seen some common themes emerge. The primary factors cited by Python users seem to be these:
For many, Python’s focus on readability, coherence, and software quality in general sets it apart from “kitchen sink” style languages like Perl. Python code is designed to be readable, and hence maintainable, much more so than traditional scripting languages. In addition, Python has deep support for software reuse mechanisms such as object-oriented programming (OOP).
Python boosts developer productivity many times beyond compiled or statically typed languages such as C, C++, and Java. Python code is typically 1/3 to 1/5 the size of equivalent C++ or Java code. That means there is less to type, less to debug, and less to maintain after the fact. Python programs also run immediately, without the lengthy compile and link steps of some other tools.
Most Python programs run unchanged on all major computer platforms. Porting Python code between Unix and Windows, for example, is usually just a matter of copying a script’s code between machines. Moreover, Python offers multiple options for coding portable graphical user interfaces.
Python comes with a large collection of prebuilt and portable functionality, known as the standard library. This library supports an array of application-level programming tasks, from text pattern matching to network scripting. In addition, Python can be extended with both home-grown libraries and a vast collection of third-party application support software.
Python scripts can easily communicate with other parts of an application, using a variety of integration mechanisms. Such integrations allow Python to be used as a product customization and extension tool. Today, Python code can invoke C and C++ libraries, can be called from C and C++ programs, can integrate with Java components, can communicate over COM, CORBA, and .NET, and can interact over networks with interfaces like SOAP and XML-RPC.
Because of Python’s ease of use and built-in toolset, it can make the act of programming more pleasure than chore. Although this may be an intangible benefit, its effect on productivity at large is an important asset.
Of these factors, the first two, quality and productivity, are probably the most compelling benefits to most Python users.
By design, Python implements both a deliberately simple and readable syntax, and a highly coherent programming model. As a slogan at a recent Python conference attests, the net result is that Python seems to just “fit your brain”—that is, features of the language interact in consistent and limited ways, and follow naturally from a small set of core concepts. This makes the language easier to learn, understand, and remember. In practice, Python programmers do not need to constantly refer to manuals when reading or writing code; it’s an orthogonal design.
By philosophy, Python adopts a somewhat minimalist approach. This means that although there are usually multiple ways to accomplish a coding task, there is usually just one obvious way, a few less obvious alternatives, and a small set of coherent interactions everywhere in the language. Moreover, Python doesn’t make arbitrary decisions for you; when interactions are ambiguous, explicit intervention is preferred over “magic.” In the Python way of thinking, explicit is better than implicit, and simple is better than complex.
Beyond such design themes, Python includes tools such as modules and OOP that naturally promote code reusability. And because Python is focused on quality, so too, naturally, are Python programmers.
During the great Internet boom of the mid-to-late 1990s, it was difficult to find enough programmers to implement software projects; developers were asked to implement systems as fast as the Internet evolved. Now, in the post-boom era of layoffs and economic recession, the picture has shifted. Today, programming staffs are forced to accomplish the same tasks with fewer people.
In both of these scenarios, Python has shined as a tool that allows programmers to get more done with less effort. It is deliberately optimized for speed of development: its simple syntax, dynamic typing, lack of compile steps, and built-in toolset allow programmers to develop programs in a fraction of the development time needed for some other tools. The net effect is that Python typically boosts developer productivity many times beyond that of traditional languages. That’s good news both in boom times and bust.
Python is a general purpose programming language that is often applied in scripting roles. It is commonly defined as an object-oriented scripting language—a definition that blends support for OOP with an overall orientation toward scripting roles. In fact, people often use the word “script” instead of “program” to describe a Python code file. In this book, the terms “script” and “program” are used interchangeably, with a slight preference for “script” to describe a simpler top-level file and “program” to refer to a more sophisticated multifile application.
Because the term “scripting” has so many different meanings to different observers, some would prefer that it not be applied to Python at all. In fact, people tend to think of three very different definitions when they hear Python labeled a “scripting” language, some of which are more useful than others:
Tools for coding operating system-oriented scripts. Such programs are often launched from console command-lines, and perform tasks such as processing text files and launching other programs. Python programs can serve such roles, but this is just one of dozens of common Python application domains. It is not just a better shell script language.
A “glue” layer used to control and direct (i.e., script) other application components. Python programs are indeed often deployed in the context of a larger application. For instance, to test hardware devices, Python programs may call out to components that give low-level access to a device. Similarly, programs may run bits of Python code at strategic points, to support end-user product customization, without having to ship and recompile the entire system’s source code. Python’s simplicity makes it a naturally flexible control tool. Technically, though, this is also just a common Python role; many Python programmers code standalone scripts, without ever using or knowing about any integrated components.
A simple language used for coding tasks quickly. This is probably the best way to think of Python as a scripting language. Python allows programs to be developed much more quickly than compiled languages like C++. Its rapid development cycle fosters an exploratory, incremental mode of programming that has to be experienced to be appreciated. Don’t be fooled, though: Python is not just for simple tasks. Rather, it makes tasks simple, by its ease of use and flexibility. Python has a simple feature set, but allows programs to scale up in sophistication as needed.
So, is Python a scripting language or not? It depends on whom you ask. In general, the term scripting is probably best used to describe the rapid and flexible mode of development that Python supports, rather than a particular application domain.
We’ll talk about implementation concepts later in this book. But in short, the standard implementations of Python today compile (i.e., translate) source code statements to an intermediate format known as byte code, and then interpret the byte code. Byte code provides portability, as it is a platform-independent format. However, because Python is not compiled all the way down to binary machine code (e.g., instructions for an Intel chip), some programs will run more slowly in Python than in a fully compiled language like C.
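The compile-then-interpret model described above can be observed from within Python itself. The following is a minimal sketch, assuming a current Python 3 interpreter; the one-line source string and the `<example>` filename label are made up for illustration, and the exact byte code listing printed by `dis` varies from one Python version to the next.

```python
import dis

# A tiny piece of source code, held as a string for illustration.
source = "total = 2 + 3\n"

# compile() translates source text into a code object that holds byte code.
code = compile(source, "<example>", "exec")

# co_code is the raw byte code; its exact contents differ between versions.
print(type(code.co_code))

# dis prints a human-readable listing of the byte code instructions.
dis.dis(code)

# The interpreter then executes the byte code in a namespace we supply.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```

Normally Python performs both steps automatically when a script is run; calling `compile` and `exec` by hand is only a way to make the two phases visible.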
Whether you will ever care about the execution speed difference depends on what kinds of programs you write. Python has been optimized numerous times, and Python code runs fast enough by itself in most application domains. Furthermore, whenever you do something “real” in a Python script, like process a file or construct a GUI, your program is actually running at C speed since such tasks are immediately dispatched to compiled C code inside the Python interpreter. More fundamentally, Python’s speed-of-development gain is often far more important than any speed-of-execution loss, especially given modern computer speeds.
Even at today’s CPU speeds there still are some domains that do require optimal execution speed. Numeric programming and animation, for example, often need at least their core number-crunching components to run at C speed (or better). If you work in such a domain, you can still use Python: simply split off the parts of the application that require optimal speed into compiled extensions, and link those into your system for use in Python scripts.
We won’t talk about extensions much in this text, but this is really just an instance of the Python-as-control-language role that we discussed earlier. A prime example of this dual language strategy is the NumPy numeric programming extension for Python; by combining compiled and optimized numeric extension libraries with the Python language, NumPy turns Python into a numeric programming tool that is both efficient and easy to use. You may never need to code such extensions in your own Python work, but they provide a powerful optimization mechanism if you ever do.
At this writing, in 2003, the best estimate anyone can seem to make of the size of the Python user base is that there are between 500,000 and 1 million Python users around the world today (plus or minus a few). This estimate is based on various statistics like downloads and comparative newsgroup traffic. Because Python is open source, a more exact count is difficult—there are no license registrations to tally. Moreover, Python is automatically included with Linux distributions and some products and computer hardware, further clouding the user base picture. In general, though, Python enjoys a large user base, and a very active developer community. Because Python has been around for over a decade and has been widely used, it is also very stable and robust.
Besides individual users, Python is also being applied in real revenue-generating products, by real companies. For instance, Google and Yahoo! currently use Python in Internet services; Hewlett-Packard, Seagate, and IBM use Python for hardware testing; Industrial Light and Magic and other companies use Python in the production of movie animation; and so on. Probably the only common thread behind companies using Python today is that Python is used all over the map, in terms of application domains. Its general purpose nature makes it applicable to almost all fields, not just one. For more details on companies using Python today, see Python’s web site at http://www.python.org.
Besides being a well-designed programming language, Python is also useful for accomplishing real world tasks—the sorts of things developers do day in and day out. It’s commonly used in a variety of domains, as a tool for both scripting other components and implementing standalone programs. In fact, as a general purpose language, Python’s roles are virtually unlimited.
However, the most common Python roles today seem to fall into a few broad categories. The next few sections describe some of Python’s most common applications today, as well as tools used in each domain. We won’t be able to describe all the tools mentioned here; if you are interested in any of these topics, see Python online or other resources for more details.
Python’s built-in interfaces to operating-system services make it ideal for writing portable, maintainable system-administration tools and utilities (sometimes called shell tools). Python programs can search files and directory trees, launch other programs, do parallel processing with processes and threads, and so on.
Python’s standard library comes with POSIX bindings, and support for all the usual OS tools: environment variables, files, sockets, pipes, processes, multiple threads, regular expression pattern matching, command-line arguments, standard stream interfaces, shell-command launchers, filename expansion, and more. In addition, the bulk of Python’s system interfaces are designed to be portable; for example, a script that copies directory trees typically runs unchanged on all major Python platforms.
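As a rough illustration of the system tools listed above, the sketch below copies a small directory tree, walks the copy, and expands a filename pattern. It assumes a current Python 3 standard library; the directory and file names (`src`, `dst`, `a.txt`, `sub/b.txt`) are invented for the example, and everything happens inside a temporary directory.

```python
import glob
import os
import shutil
import tempfile

# Work inside a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as workdir:
    # Build a small source tree: src/a.txt and src/sub/b.txt.
    src = os.path.join(workdir, "src")
    os.makedirs(os.path.join(src, "sub"))
    for relpath in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(src, relpath), "w") as f:
            f.write("hello\n")

    # Copy the whole directory tree with a single portable call.
    dst = os.path.join(workdir, "dst")
    shutil.copytree(src, dst)

    # Walk the copied tree and collect file paths relative to its root.
    copied = sorted(
        os.path.relpath(os.path.join(dirpath, name), dst)
        for dirpath, dirnames, filenames in os.walk(dst)
        for name in filenames
    )

    # Filename expansion, much as a shell would do for *.txt.
    top_level = [
        os.path.basename(path)
        for path in glob.glob(os.path.join(dst, "*.txt"))
    ]

# Environment variables are available as a dictionary-like object.
path_entries = os.environ.get("PATH", "").split(os.pathsep)

print(copied)
print(top_level)
print(len(path_entries))
```

Because paths are built with `os.path.join` rather than hard-coded separators, the same script should run unchanged on Unix and Windows.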
Python’s simplicity and rapid turnaround also make it a good match for GUI (graphical user interface) programming. Python comes with a standard object-oriented interface to the Tk GUI API called Tkinter, which allows Python programs to implement portable GUIs with native look and feel. Python/Tkinter GUIs run unchanged on MS Windows, X Windows (on Unix and Linux), and Macs. A free extension package, PMW, adds advanced widgets to the base Tkinter toolkit. In addition, the wxPython GUI API, based on a C++ library, offers an alternative toolkit for constructing portable GUIs in Python.
Higher-level toolkits such as PythonCard and PMW are built on top of base APIs such as wxPython and Tkinter. With the proper library, you can also use other GUI toolkits in Python such as Qt, GTK, MFC, and Swing. For applications that run in web browsers or have simple interface requirements, both Jython and Python server-side CGI scripts provide additional user interface options.
Python comes with standard Internet modules that allow Python programs to perform a wide variety of networking tasks, in both client and server modes. Scripts can communicate over sockets; extract form information sent to a server-side CGI script; transfer files by FTP; process XML files; send, receive, and parse email; fetch web pages by URLs; parse the HTML and XML of fetched web pages; communicate over XML-RPC, SOAP, and telnet; and more. Python’s libraries make these tasks remarkably simple.
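To give a feel for a few of these tasks without touching the network, here is a minimal sketch that parses an email message, extracts links from a fragment of HTML, and splits a URL into parts. It assumes the module names of a current Python 3 standard library (`email`, `html.parser`, `urllib.parse`), which differ from those in older releases; the message text, addresses, and the `LinkCollector` class are made up for illustration.

```python
from email import message_from_string
from html.parser import HTMLParser
from urllib.parse import urlparse

# A made-up email message: headers, a blank line, then the body.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "See http://www.python.org for details.\n"
)

# Parse the text into a message object with dictionary-style header access.
msg = message_from_string(raw)
subject = msg["Subject"]
body = msg.get_payload()


class LinkCollector(HTMLParser):
    """Hypothetical helper that records the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")


# Feed the parser some HTML, as if it had been fetched from a web page.
parser = LinkCollector()
parser.feed('<p>Visit <a href="http://www.python.org/doc/">the docs</a>.</p>')

# Split the first collected URL into its components.
host = urlparse(parser.links[0]).netloc

print(subject)
print(parser.links)
print(host)
```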
We discussed the component integration role earlier, when describing Python as a control language. Python’s ability to be extended by and embedded in C and C++ systems makes it useful as a flexible glue language, for scripting the behavior of other systems and components. For instance, by integrating a C library into Python, Python can test and launch its components. And by embedding Python in a product, on-site customizations can be coded without having to recompile the entire product, or ship its source code at all.
Tools such as the SWIG code generator can automate much of the work needed to link compiled components into Python for use in scripts. And larger frameworks such as Python’s COM support on MS Windows, the Jython Java-based implementation, the Python.NET system, and various CORBA toolkits for Python provide alternative ways to script components. On Windows, for example, Python scripts can use frameworks to script MS Word and Excel, and serve the same sorts of roles as Visual Basic.
A standard library module provides a simple object persistence system: it allows programs to easily save and restore entire Python objects to files and file-like objects. For more traditional database demands, there are Python interfaces to Sybase, Oracle, Informix, ODBC, MySQL, and more.
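As a minimal sketch of saving and restoring whole objects this way, the following uses the standard library's `pickle` module, on the assumption that it is the module meant above. The `record` dictionary is invented for illustration, and an in-memory `io.BytesIO` buffer stands in for a real file.

```python
import io
import pickle

# A nested object built from ordinary built-in types.
record = {"name": "Bob", "jobs": ["dev", "mgr"], "pay": (40000, 50000)}

# Any file-like object opened for binary writing will do;
# BytesIO stands in for a real file here.
buffer = io.BytesIO()
pickle.dump(record, buffer)

# Rewind and load the object back, as a later run of a program might.
buffer.seek(0)
restored = pickle.load(buffer)

print(restored == record)   # same value
print(restored is record)   # but a distinct, newly built object
```

One caution worth keeping in mind: pickled data should only be loaded from sources you trust, since loading can run arbitrary code.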
The Python world has also defined a portable database API for accessing SQL database systems from Python scripts, which looks the same on a variety of underlying database systems. For instance, because vendor interfaces implement the portable API, a script written to work with the free MySQL system will work largely unchanged on other systems such as Oracle by simply replacing the underlying vendor interface. On the web, you’ll also find a third-party system named gadfly that implements a SQL database for Python programs, and a complete object-oriented database system called ZODB.
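The portability idea can be sketched with any module that follows the database API. The example below uses `sqlite3`, which ships with current Python releases, purely as a stand-in for a vendor interface such as one for MySQL or Oracle; the `load_and_query` function, the `people` table, and its rows are hypothetical.

```python
import sqlite3


def load_and_query(connect, *connect_args):
    """Run the same statements through any API-conforming connect function."""
    conn = connect(*connect_args)
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE people (name TEXT, age INTEGER)")
        # The '?' placeholder style is what sqlite3 uses; other interface
        # modules may use a different style, which is one reason scripts
        # are "largely" rather than entirely unchanged across systems.
        cur.executemany(
            "INSERT INTO people VALUES (?, ?)",
            [("Bob", 40), ("Sue", 35)],
        )
        conn.commit()
        cur.execute(
            "SELECT name FROM people WHERE age > ? ORDER BY name", (36,)
        )
        return cur.fetchall()
    finally:
        conn.close()


# Swapping databases mostly means passing a different connect function.
rows = load_and_query(sqlite3.connect, ":memory:")
print(rows)
```

Only the `connect` call and its arguments are specific to the underlying system; the cursor, `execute`, and `fetchall` calls are the portable part.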
To Python programs, components written in Python and C look the same. Because of this, it’s possible to prototype systems in Python initially and then move components to a compiled language such as C or C++ for delivery. Unlike some prototyping tools, Python doesn’t require a complete rewrite once the prototype has solidified. Parts of the system that don’t require the efficiency of a language such as C++ can remain coded in Python for ease of maintenance and use.
The NumPy numeric programming extension for Python mentioned earlier includes such advanced tools as an array object, interfaces to standard mathematical libraries, and much more. By integrating Python with numeric routines coded in a compiled language for speed, NumPy turns Python into a sophisticated yet easy-to-use numeric programming tool, which can often replace existing code written in traditional compiled languages such as FORTRAN or C++. Additional numeric tools for Python support animation, 3D visualization, and so on.
Python is commonly applied in more domains than can be mentioned here. For example, you can do graphics and game programming in Python with the pygame system; image processing with the PIL package and others; AI programming with neural network simulators and expert system shells; XML parsing with the xml library package, the xmlrpclib module, and third-party extensions; and even play solitaire with the PySol program.
You’ll find support for many such fields at the Vaults of Parnassus web site (linked from http://www.python.org). (The Vaults of Parnassus is a large collection of links to third-party software for Python programming. If you need to do something special with Python, the Vaults is usually the best first place to look for resources.)
In general, many of these specific domains are largely just instances of Python’s component integration role in action again. By adding Python as a frontend to libraries of components written in a compiled language such as C, Python becomes useful for scripting in a wide variety of domains. As a general purpose language that supports integration, Python is widely applicable.
Naturally, this is a developer’s question. If you don’t already have a programming background, the words in the next few sections may be a bit baffling; don’t worry, we’ll explain all of these in more detail as we proceed through this book. For developers, though, here is a quick introduction to some of Python’s top technical features.
Python is an object-oriented language, from the ground up. Its class model supports advanced notions such as polymorphism, operator overloading, and multiple inheritance; yet in the context of Python’s simple syntax and typing, OOP is remarkably easy to apply. In fact, if you don’t understand these terms, you’ll find they are much easier to learn with Python than with just about any other OOP language available.
Besides serving as a powerful code structuring and reuse device, Python’s OOP nature makes it ideal as a scripting tool for object-oriented systems languages such as C++ and Java. For example, with the appropriate glue code, Python programs can subclass (specialize) classes implemented in C++ or Java. Of equal significance, OOP is an option in Python; you can go far without having to become an object guru all at once.
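For readers who already program, the OOP terms used above can be shown in a few lines. This is only a sketch: the `Vector`, `Described`, and `DescribedVector` classes and the `total` function are invented for illustration, and classes are covered properly later in the book.

```python
class Vector:
    """A 2D vector that supports the + operator."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Operator overloading: this method runs for vector + vector.
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"


class Described:
    """A mix-in class that adds a describe() method."""

    def describe(self):
        return "I am " + repr(self)


class DescribedVector(Described, Vector):
    """Multiple inheritance: combines both superclasses."""


def total(items, start):
    # Polymorphism: works for any objects that support +.
    result = start
    for item in items:
        result = result + item
    return result


print(total([1, 2, 3], 0))
print(total([Vector(1, 2), Vector(3, 4)], Vector(0, 0)))
print(DescribedVector(1, 2).describe())
```

Note that `total` is written once and never mentions a type; it works on numbers and on `Vector` objects alike because both support `+`.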
Python is free. Just like other open source software, such as Tcl, Perl, Linux, and Apache, you can get the entire Python system for free on the Internet. There are no restrictions on copying it, embedding it in your systems, or shipping it with your products. In fact, you can even sell Python’s source code, if you are so inclined.
But don’t get the wrong idea: “free” doesn’t mean “unsupported.” On the contrary, the Python online community responds to user queries with a speed that most commercial software vendors would do well to notice. Moreover, because Python comes with complete source code, it empowers developers, and creates a large team of implementation experts. Although studying or changing a programming language’s implementation isn’t everyone’s idea of fun, it’s comforting to know that it’s available as a final resort and ultimate documentation source. You’re not dependent on a commercial vendor.
Python development is performed by a community, which largely coordinates its efforts over the Internet. It consists of Python’s creator—Guido van Rossum, the officially anointed Benevolent Dictator For Life (BDFL) of Python—plus a cast of thousands. Language changes must both follow a formal enhancement procedure (known as the PEP process), and be scrutinized by the BDFL. Happily, this tends to make Python more conservative with changes than some other languages.
The standard implementation of Python is written in portable ANSI C, and compiles and runs on virtually every major platform in use today. For example, Python programs run today on everything from PDAs to supercomputers. As a partial list, Python is available on Unix systems, Linux, MS-DOS, MS Windows (95, 98, NT, 2000, XP, etc.), Macintosh (classic and OS X), Amiga, Atari ST, BeOS, OS/2, VMS, QNX, VxWorks, PalmOS, PocketPC and CE, Cray supercomputers, IBM mainframes, PDAs running Linux, and more.
Besides the language interpreter itself, the set of standard library modules that ships with Python is also implemented to be as portable across platform boundaries as possible. Further, Python programs are automatically compiled to portable byte code, which runs the same on any platform with a compatible version of Python installed (more on this in the next chapter).
What that means is that Python programs using the core language and standard libraries run the same on Unix, MS Windows, and most other systems with a Python interpreter. Most Python ports also contain platform-specific extensions (e.g., COM support on MS Windows), but the core Python language and libraries work the same everywhere. As mentioned earlier, Python also includes an interface to the Tk GUI toolkit called Tkinter, which allows Python programs to implement full-featured graphical user interfaces that run on all major GUI platforms without program changes.
From a features perspective, Python is something of a hybrid. Its tool set places it between traditional scripting languages (such as Tcl, Scheme, and Perl), and systems development languages (such as C, C++, and Java). Python provides all the simplicity and ease of use of a scripting language, along with more advanced software engineering tools typically found in compiled languages. This combination makes Python, unlike some scripting languages, useful for large-scale development projects. As a preview, here are some of the main things we’ll find in Python’s toolbox:
Python keeps track of the kinds of objects your program uses when it runs; it doesn’t require complicated type and size declarations in your code. In fact, as we’ll see in Chapter 4, there is no such thing as a type or variable declaration anywhere to be found in Python.
For building larger systems, Python includes tools such as modules, classes, and exceptions. These tools allow you to organize systems into components, use OOP to reuse and customize code, and handle events and errors gracefully.
Python provides commonly used data structures such as lists, dictionaries, and strings, as an intrinsic part of the language; as we’ll see, they’re both flexible and easy to use. For instance, built-in objects can grow and shrink on demand, can be arbitrarily nested to represent complex information, and more.
To process all those object types, Python comes with powerful and standard operations, including concatenation (joining collections), slicing (extracting sections), sorting, mapping, and more.
For more specific tasks, Python also comes with a large collection of pre-coded library tools that support everything from regular-expression matching to networking. Python’s library tools are where much of the application-level action occurs.
Because Python is freeware, it encourages developers to contribute precoded tools that support tasks beyond Python’s built-ins; you’ll find free support for COM, imaging, CORBA ORBs, XML, database vendors, and much more.
Despite the array of tools in Python, it retains a remarkably simple syntax and design. The result is a powerful programming tool, which retains the usability of a scripting language.
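Several of the tools previewed above fit in a short sketch: dynamic typing, built-in object types that grow and nest, the standard operations on them, and exception handling. The variable names and data here are made up for illustration, and each feature is explained in detail in later chapters.

```python
# Dynamic typing: names are never declared, and the same name
# may refer to objects of different types over time.
value = 42
value = "spam"

# Built-in object types can be nested arbitrarily and grow on demand.
inventory = {"fruit": ["apple", "pear"], "count": 2}
inventory["fruit"].append("fig")
inventory["count"] = len(inventory["fruit"])

# Standard operations work across the built-in collection types.
letters = list("spam") + ["!"]          # concatenation
middle = letters[1:3]                   # slicing
ordered = sorted(letters)               # sorting
upper = [ch.upper() for ch in letters]  # mapping an operation over items

# Exceptions let a program handle errors gracefully.
try:
    inventory["missing"]
except KeyError as exc:
    problem = f"no such key: {exc}"

print(value, inventory)
print(letters, middle, ordered, upper)
print(problem)
```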
Python programs can be easily “glued” to components written in other languages, in a variety of ways. For example, Python’s C API lets C programs call and be called by Python programs flexibly. That means you can add functionality to the Python system as needed, and use Python programs within other environments or systems.
For example, by mixing Python with libraries coded in languages such as C or C++, it becomes an easy-to-use frontend language and customization tool. As mentioned earlier, this also makes Python good at rapid prototyping; systems may be implemented in Python first to leverage its speed of development, and later moved to C for delivery, one piece at a time, according to performance demands.
To run a Python program, you simply type it and run it. There are no intermediate compile and link steps like there are for languages such as C or C++. Python executes programs immediately, which makes for both an interactive programming experience and rapid turnaround after program changes.
Of course, development cycle turnaround is only one aspect of Python’s ease of use. It also provides a deliberately simple syntax and powerful high-level built-in tools. In fact, some have gone so far as to call Python “executable pseudocode.” Because it eliminates much of the complexity in other tools, Python programs are simpler, smaller, and more flexible than equivalent programs in languages like C, C++, and Java.
This brings us to the topic of this book: compared to other programming languages, the core Python language is remarkably easy to learn. In fact, you can expect to be coding significant Python programs in a matter of days (and perhaps in just hours, if you’re already an experienced programmer). That’s good news both for professional developers seeking to learn the language to use on the job, and for end users of systems that expose a Python layer for customization or control. Today, many systems rely on the fact that end users can quickly learn enough Python to tailor their Python customization code onsite, with little or no support.
Finally, in terms of what you may already know, people sometimes compare Python to languages such as Perl, Tcl, and Java. We talked about performance earlier, so here the focus is on functionality. While other languages are also useful tools to know and use, we think that Python:
Is more powerful than Tcl. Python’s support for “programming in the large” makes it applicable to larger systems development.
Has a cleaner syntax and simpler design than Perl, which makes it more readable and maintainable, and helps reduce program bugs.
Is simpler and easier to use than Java. Python is a scripting language, but Java inherits much of the complexity of systems languages such as C++.
Is simpler and easier to use than C++, but often doesn’t compete with C++ either; as a scripting language, Python often serves different roles.
Is both more powerful and more cross-platform than Visual Basic. Its open source nature also means it is not controlled by a single company.
Has the dynamic flavor of languages like Smalltalk and Lisp, but also has a simple, traditional syntax accessible to developers and end users.
Especially for programs that do more than scan text files, and that might have to be read in the future by others (or by you!), we think Python fits the bill better than any other scripting language available today. Furthermore, unless your application requires peak performance, Python is often a viable alternative to systems development languages such as C, C++, and Java; Python code will be much less to write, debug, and maintain.
Of course, both of the authors are card-carrying Python evangelists, so take these comments as you may. They do, however, reflect the common experience of many developers who have taken time to explore what Python has to offer.
And that concludes the hype portion of this book. The best way to judge a language is to see it in action, so the next two chapters turn to a strictly technical introduction to the language. There, we explore ways to run Python programs, peek at Python’s byte code execution model, and introduce the basics of module files for saving your code. Our goal will be to give you just enough information to run the examples and exercises in the rest of the book. As mentioned earlier, you won’t really start programming until Chapter 4, but make sure you have a handle on the startup details before moving on.
 For a more complete look at
the Python philosophy, type the command
import this at any Python interactive prompt
(you’ll see how in Chapter 2).
This invokes an easter egg hidden in Python, a collection of Python
design principles. The acronym EIBTI has lately become fashionable
for the “explicit is better than