The operation of a Python program hinges on the data it handles. All data values in Python are represented by objects, and each object, or value, has a type. An object’s type determines what operations the object supports, or, in other words, what operations you can perform on the data value. The type also determines the object’s attributes and items (if any) and whether the object can be altered. An object that can be altered is known as a mutable object, while one that cannot be altered is an immutable object. I cover object attributes and items in detail later in this chapter.
The built-in type(obj) accepts any object as its argument and returns the type object that represents the type of obj. Another built-in function, isinstance(obj, type), returns True if object obj is an instance of the type represented by type object type; otherwise, it returns False (built-in names True and False were introduced in Python 2.2.1; in older versions, 1 and 0 are used instead).
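As a minimal sketch of these two built-ins (the variable names are illustrative only, and the snippet assumes an interpreter where int and str are type objects, that is, Python 2.2 or later):

```python
# type(obj) returns the type object that represents obj's type.
number_type = type(42)
text_type = type('hello')

# isinstance(obj, type) tests whether obj is an instance of that type.
is_int = isinstance(42, int)        # True: 42 is an integer
is_str = isinstance(42, str)        # False: 42 is not a string

# Two objects of the same type share a single type object.
same_type = type(42) is type(7)
```

Comparing type objects with `is` works because each type is represented by exactly one type object.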
Python has built-in objects for fundamental data types such as numbers, strings, tuples, lists, and dictionaries, as covered in the following sections. You can also create user-defined types, known as classes, as discussed in detail in Chapter 5.
The built-in number objects in Python support integers (plain and long), floating-point numbers, and complex numbers. All numbers in Python are immutable objects, meaning that when you perform an operation on a number object, you always produce a new number object. Operations on numbers, called arithmetic operations, are covered later in this chapter.
Integer literals can be decimal, octal, or hexadecimal. A decimal literal is represented by a sequence of digits where the first digit is non-zero. An octal literal is specified with a 0 followed by a sequence of octal digits (0 to 7). To indicate a hexadecimal literal, use 0x followed by a sequence of hexadecimal digits (0 to 9 and A to F, in either upper- or lowercase). For example:
1, 23, 3493                  # Decimal integers
01, 027, 06645               # Octal integers
0x1, 0x17, 0xDA5             # Hexadecimal integers
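The three rows above denote the same three values in different bases. The sketch below checks this under one assumption: it parses the octal digits with int(text, base) rather than writing a 0-prefixed literal, so that it also runs on Python 3 interpreters, which no longer accept that literal form.

```python
decimal_value = 3493
hex_value = 0xDA5                    # hexadecimal literal
octal_value = int('06645', 8)        # octal digits parsed explicitly in base 8

# All three spellings denote the same integer.
all_equal = (decimal_value == hex_value == octal_value)

# The middle column of the examples works the same way: 23 == 027 == 0x17.
twenty_three = (23 == int('027', 8) == 0x17)
```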
Any kind of integer literal may be followed by the letter L or l to denote a long integer. For instance:
1L, 23L, 99999333493L        # Long decimal integers
01L, 027L, 01351033136165L   # Long octal integers
0x1L, 0x17L, 0x17486CBC75L   # Long hexadecimal integers
Use uppercase L here, not lowercase l, which may look like the digit 1. The difference between a long integer and a plain integer is that a long integer has no predefined size limit: it may be as large as memory allows. A plain integer takes up a few bytes of memory and has minimum and maximum values that are dictated by machine architecture. sys.maxint is the largest available plain integer, while -sys.maxint-1 is the most negative one. On typical 32-bit machines, sys.maxint is 2147483647.
A floating-point literal is represented by a sequence of decimal digits that includes a decimal point (.), an exponent part (an e or E, optionally followed by + or -, followed by one or more digits), or both. The leading character of a floating-point literal cannot be e or E: it may be any digit or a period (.). Prior to Python 2.2, a leading 0 had to be immediately followed by a period. For example:
0., 0.0, .0, 1., 1.0, 1e0, 1.e0, 1.0e0
A Python floating-point value corresponds to a C double and shares its limits of range and precision, typically 53 bits of precision on modern platforms. (Python currently offers no way to find out this range and precision.)
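A short sketch of the literal forms and of the limited precision just described. It assumes a platform whose C double is an IEEE 754 binary64 value, which is the usual case; the names are illustrative.

```python
# Every spelling in each list denotes the same floating-point value.
ones = [1., 1.0, 1e0, 1.e0, 1.0e0]
zeros = [0., 0.0, .0]
all_ones = all(value == 1.0 for value in ones)
all_zeros = all(value == 0.0 for value in zeros)

# The exponent part scales by a power of ten and may carry a sign.
scaled_up = 1.5e3        # 1500.0
scaled_down = 25e-1      # 2.5

# With 53 bits of precision, 0.1 and 0.2 are stored only approximately,
# so their sum does not compare equal to 0.3.
exact_sum = (0.1 + 0.2 == 0.3)
```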
A complex number is made up of two floating-point values, one each for the real and imaginary parts. You can access the parts of a complex object z as read-only attributes z.real and z.imag. You can specify an imaginary literal as a floating-point or decimal literal followed by a j or J:
0j, 0.j, 0.0j, .0j, 1j, 1.j, 1.0j, 1e0j, 1.e0j, 1.0e0j
The j at the end of the literal indicates the square root of -1, as commonly used in electrical engineering (some other disciplines use i for this purpose, but Python has chosen j). There are no other complex literals; constant complex numbers are denoted by adding or subtracting a floating-point literal and an imaginary one.
Note that numeric literals do not include a sign: a leading + or -, if present, is a separate operator, as discussed later in this chapter.
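The following sketch shows a complex constant built from a floating-point literal and an imaginary literal, and the read-only nature of its parts. Which exception an assignment raises varies between Python versions, so the sketch catches both plausible ones; that is an assumption, not something the text specifies.

```python
z = 3.0 + 4.0j           # floating-point literal plus imaginary literal
real_part = z.real       # 3.0
imag_part = z.imag       # 4.0

# The parts are read-only: assigning to them fails.
try:
    z.real = 1.0
    assignment_failed = False
except (AttributeError, TypeError):
    assignment_failed = True

# j denotes the square root of -1, so squaring it gives -1.
minus_one = 1j * 1j

# The minus sign here is an operator applied to the literal 2.5.
negated = -2.5
```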
A sequence is an ordered container of items, indexed by non-negative integers. Python provides built-in sequence types for strings (plain and Unicode), tuples, and lists. Library and extension modules provide other sequence types, and you can write yet others yourself (as discussed in Chapter 5). Sequences can be manipulated in a variety of ways, as discussed later in this chapter.
A built-in string object is an ordered collection of characters used to store and represent text-based information. Strings in Python are immutable, meaning that when you perform an operation on a string, you always produce a new string object rather than mutating the existing string. String objects provide numerous methods, as discussed in detail in Chapter 9.
A string literal can be quoted or triple-quoted. A quoted string is a sequence of zero or more characters enclosed in matching quote characters, single (') or double ("). For example:
'This is a literal string'
"This is another string"
The two different kinds of quotes function identically; having both allows you to include one kind of quote inside of a string specified with the other kind without needing to escape them with the backslash character (\):
'I\'m a Python fanatic'      # a quote can be escaped
"I'm a Python fanatic"       # this way is more readable
To have a string span multiple lines, you can use a backslash as the last character of the line to indicate that the next line is a continuation:
"A not very long string\
that spans two lines"        # comment not allowed on previous line
To make the string output on two lines, you must embed a newline in the string:
"A not very long string\n\
that prints on two lines"    # comment not allowed on previous line
Another approach is to use a triple-quoted string, which is enclosed by matching triplets of quote characters (''' or """):
"""An even bigger
string that spans
three lines"""               # comments not allowed on previous lines
In a triple-quoted string literal, line breaks in the literal are preserved as newline characters in the resulting string object.
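To make the difference between these forms concrete, here is a small sketch that counts the newline characters each literal actually produces. The variable names are illustrative, and the first literal puts a space before the backslash so that the joined words stay separated.

```python
# Both quoting styles produce the same string.
escaped = 'I\'m a Python fanatic'
readable = "I'm a Python fanatic"
same_text = (escaped == readable)

# A backslash at the end of the line continues the literal: no newline is stored.
one_line = "A not very long string \
that spans two lines"

# An explicit \n embeds a newline; the trailing backslash still only continues.
two_lines = "A not very long string\n\
that prints on two lines"

# In a triple-quoted literal, each line break becomes a newline character.
three_lines = """An even bigger
string that spans
three lines"""

newline_counts = (one_line.count('\n'),
                  two_lines.count('\n'),
                  three_lines.count('\n'))
```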
The only character that cannot be part of a triple-quoted string is an unescaped backslash, while a quoted string cannot contain an unescaped backslash, a line-end, or the quote character that encloses it. The backslash character starts an escape sequence, which lets you introduce any character in either kind of string. Python’s string escape sequences are listed in Table 4-1.
Table 4-1. String escape sequences
A variant of a string literal is a raw string. The syntax is the same as for quoted or triple-quoted string literals, except that an r or R immediately precedes the leading quote. In raw strings, escape sequences are not interpreted as in Table 4-1, but are literally copied into the string, including backslashes and newline characters. Raw string syntax is handy for strings that include many backslashes, as in regular expressions (see Chapter 9). A raw string cannot end with an odd number of backslashes: the last one would be taken as escaping the terminating quote.
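A brief sketch contrasting an ordinary literal with its raw counterpart; the path used at the end is a made-up example, not one taken from the text.

```python
plain = 'a\nb'               # a, newline, b
raw = r'a\nb'                # a, backslash, n, b

plain_length = len(plain)    # 3
raw_length = len(raw)        # 4

# The raw literal equals an ordinary literal with the backslash doubled.
same_as_doubled = (raw == 'a\\nb')

# Hypothetical path, shown only to illustrate a backslash-heavy string.
windows_path = r'C:\temp\new'
backslashes = windows_path.count('\\')
```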
Unicode string literals have the same syntax as other string literals, plus a u or U immediately before the leading quote character. Unicode string literals can use \u followed by four hexadecimal digits to denote Unicode characters, and can also include the kinds of escape sequences listed in Table 4-1. Unicode literals can also include the escape sequence \N{name}, where name is a standard Unicode name as per the list at http://www.unicode.org/charts/. For example, \N{Copyright Sign} indicates a Unicode copyright sign character (©). Raw Unicode string literals start with ur, not ru.
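As a small sketch, the copyright sign can be written by name or by its four-digit code. The code point 00A9 is the standard one for this character; the snippet assumes an interpreter that accepts the u prefix (Python 2, or Python 3.3 and later).

```python
by_name = u'\N{Copyright Sign}'     # character given by its Unicode name
by_code = u'\u00a9'                 # same character given by four hex digits

same_character = (by_name == by_code)
length = len(by_name)               # a single character
```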
Multiple string literals of any kind (quoted, triple-quoted, raw, Unicode) can be adjacent, with optional whitespace in between. The compiler concatenates such adjacent string literals into a single string object. If any literal in the concatenation is Unicode, the whole result is Unicode. Writing a long string literal in this way lets you present it readably across multiple physical lines, and gives you an opportunity to insert comments about parts of the string. For example:
marypop = ('supercalifragilistic'   # Open paren -> logical line continues
           'expialidocious')        # Indentation ignored in continuation
The result here is a single word of 34 characters.
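The sketch below checks that length and also mixes a raw literal with an ordinary one, which the rule above permits. The name pattern and its contents are illustrative only.

```python
marypop = ('supercalifragilistic'
           'expialidocious')
length = len(marypop)

# Adjacent literals of different kinds are concatenated too:
pattern = (r'\d+'    # raw piece: backslash, d, plus
           '\n')     # ordinary piece: one newline character
```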
A tuple is an immutable ordered sequence of items. The items of a tuple are arbitrary objects and may be of different types. To specify a tuple, use a series of expressions (the items of the tuple) separated by commas (,). You may optionally place a redundant comma after the last item. You may group tuple items with parentheses, but the parentheses are needed only where the commas would otherwise have another meaning (e.g., in function calls) or to denote empty or nested tuples. A tuple with exactly two items is also often called a pair. To create a tuple of one item (a singleton), add a comma to the end of the expression. An empty tuple is denoted by an empty pair of parentheses. Here are some tuples, all enclosed in optional parentheses:
(100,200,300)                # Tuple with three items
(3.14,)                      # Tuple with one item
( )                          # Empty tuple
You can also call the built-in tuple to create a tuple. For example:
tuple('wow')
This builds a tuple equal to:
('w', 'o', 'w')
tuple() without arguments creates and returns an empty tuple. When x is a sequence, tuple(x) returns a tuple whose items are the same as the items in sequence x.
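A minimal sketch of the tuple forms described above, including the trailing comma that distinguishes a singleton from a parenthesized expression, and the failure you get when you try to alter a tuple. Names are illustrative.

```python
pair = 1, 2                  # parentheses are optional here
singleton = (3.14,)          # the trailing comma makes this a tuple
not_a_tuple = (3.14)         # just a parenthesized float
empty = ()
letters = tuple('wow')       # one item per character of the string

# Tuples are immutable: item assignment raises TypeError.
try:
    letters[0] = 'W'
    mutated = True
except TypeError:
    mutated = False
```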
A list is a mutable ordered sequence of items. The items of a list are arbitrary objects and may be of different types. To specify a list, use a series of expressions (the items of the list) separated by commas (,) and within brackets ([ ]). You may optionally place a redundant comma after the last item. An empty list is denoted by an empty pair of brackets. Here are some example lists:
[42,3.14,'hello']            # List with three items
[100]                        # List with one item
[ ]                          # Empty list
You can also call the built-in list to create a list. For example:
list('wow')
This builds a list equal to:
['w', 'o', 'w']
list() without arguments creates and returns an empty list. When x is a sequence, list(x) creates and returns a new list whose items are the same as the items in sequence x. You can also build lists with list comprehensions, as discussed later in this chapter.
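The sketch below illustrates that list(x) returns a new list object, so that later changes to one list do not show up in the other. The variable names are illustrative.

```python
original = [42, 3.14, 'hello']
duplicate = list(original)       # new list with the same items

# Lists are mutable: both of these change a list in place.
duplicate.append('extra')
original[0] = 0

distinct_objects = duplicate is not original
letters = list('wow')
```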
A mapping is an arbitrary collection of objects indexed by nearly arbitrary values called keys. Mappings are mutable and, unlike sequences, are unordered. Python provides a single built-in mapping type, the dictionary type. Library and extension modules provide other mapping types, and you can write others yourself (as discussed in Chapter 5). Keys in a dictionary may be of different types, but they must be hashable (see function hash in Section 8.2 in Chapter 8). Values in a dictionary are arbitrary objects and may be of different types. An item in a dictionary is a key/value pair. You can think of a dictionary as an associative array (also known in some other languages as a hash).
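One way to see which objects qualify as keys is to call the built-in hash on them. The helper is_hashable below is hypothetical, written only for this illustration.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, which a dictionary key requires."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

# Immutable built-in values are hashable; mutable containers are not.
results = [is_hashable(42),
           is_hashable('x'),
           is_hashable((1, 2)),
           is_hashable([1, 2]),
           is_hashable({}),
           is_hashable((1, [2]))]   # a tuple holding a list is not hashable
```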
To specify a dictionary, use a series of pairs of expressions (the pairs are the items of the dictionary) separated by commas (,) within braces ({ }). You may optionally place a redundant comma after the last item. Each item in a dictionary is written key:value, where key is an expression giving the item’s key and value is an expression giving the item’s value. If a key appears more than once in a dictionary, only one of the items with that key is kept in the dictionary. In other words, dictionaries do not allow duplicate keys. An empty dictionary is denoted by an empty pair of braces. Here are some dictionaries:
{ 'x':42, 'y':3.14, 'z':7 }  # Dictionary with three items and string keys
{ 1:2, 3:4 }                 # Dictionary with two items and integer keys
{ }                          # Empty dictionary
In Python 2.2 and up, you can call the built-in dict to create a dictionary. For example:
dict([[1,2],[3,4]])
This builds a dictionary equal to:
{1:2,3:4}
dict() without arguments creates and returns an empty dictionary. When the argument x to dict is a mapping, dict returns a new dictionary object with the same keys and values as x. When x is a sequence, the items in x must be pairs, and dict(x) returns a dictionary whose items (key/value pairs) are the same as the items in sequence x. If a key appears more than once in x, only the last item with that key is kept in the resulting dictionary.
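A short sketch of the three ways of calling dict described above; the names and sample data are illustrative.

```python
# From a sequence of pairs.
from_pairs = dict([[1, 2], [3, 4]])

# A repeated key: only the last item with that key survives.
last_wins = dict([('x', 1), ('y', 2), ('x', 3)])

# From a mapping: the result is a new, independent dictionary.
source = {'a': 1}
duplicate = dict(source)
duplicate['b'] = 2

empty = dict()
```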
The built-in object None denotes a null object. None has no methods or other attributes. You can use None as a placeholder when you need a reference but you don’t care about what object you refer to, or when you need to indicate that no object is there. Functions return None as their result unless they have specific return statements coded to return other values.
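The following sketch shows the ways a function ends up returning None. All three functions are hypothetical examples written for this illustration.

```python
def no_return():
    pass                      # falls off the end: result is None

def bare_return():
    return                    # return without a value: result is None

def find_first_even(numbers):
    """Return the first even number, or None when there is none."""
    for n in numbers:
        if n % 2 == 0:
            return n
    # No explicit return here, so the caller receives None.

found = find_first_even([1, 4, 6])
missing = find_first_even([1, 3, 5])
```

Testing the result with `is None` is the usual way to detect the "no object" case.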
In Python, callable types are those whose instances support the function call operation (see Section 4.4 later in this chapter). Functions are obviously callable, and Python provides built-in functions (see Chapter 8) and also supports user-defined functions (see Section 4.10 later in this chapter). Generators, which are new as of Python 2.2, are also callable (see Section 4.10.8 later in this chapter).
Types are also callable. Thus, the dict, list, and tuple built-ins discussed earlier are in fact types. Prior to Python 2.2, these names referred to factory functions for creating objects of these types. As of Python 2.2, however, they refer to the type objects themselves. Since types are callable, this change does not break existing programs. See Chapter 8 for a complete list of built-in types.
As we’ll discuss in Chapter 5, class objects are callable. So are methods, which are functions bound to class attributes. Finally, class instances whose classes supply __call__ methods are also callable.
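As a minimal sketch of these kinds of callables, the hypothetical class Greeter below supplies a __call__ method, and the built-in callable reports which objects support the call operation. The snippet assumes an interpreter that provides callable (it is absent from Python 3.0 and 3.1).

```python
class Greeter(object):
    """Hypothetical class whose instances can be called like functions."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, name):
        return self.greeting + ', ' + name

hello = Greeter('Hello')          # calling the class creates an instance
message = hello('world')          # calling the instance runs __call__

checks = [callable(len),          # built-in function
          callable(list),         # type
          callable(Greeter),      # class object
          callable(hello),        # instance with __call__
          callable(hello.__call__),   # bound method
          callable(42)]           # a plain number is not callable
```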
Prior to Python 2.3, there is no explicit Boolean type in Python. However, every data value in Python can be evaluated as a truth value: true or false. Any non-zero number or non-empty string, tuple, list, or dictionary evaluates as true. Zero (of any numeric type), None, and empty strings, tuples, lists, and dictionaries evaluate as false. Python also has a number of built-in functions that return Boolean results.
Built-in names True and False were introduced in Python 2.2.1 to represent true and false; in older versions of Python, 1 and 0 are used instead. Throughout the rest of this book, I will use True and False to represent true and false. If you are using a version of Python older than 2.2.1, you’ll need to substitute 1 and 0 when using examples from this book.
Python 2.2.1 also introduced a new built-in function named bool. When this function is called with any argument, it considers the argument’s value in a Boolean context and returns False or True accordingly.
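A quick sketch that applies bool to the values named in the truth-value rules above; the two lists are illustrative samples, not an exhaustive catalog.

```python
# Zero of any numeric type, None, and empty containers are false.
false_values = [0, 0.0, 0j, None, '', (), [], {}]

# Non-zero numbers and non-empty containers are true, whatever they contain.
true_values = [1, -1, 0.5, 'x', ' ', (0,), [0], {0: 0}]

false_results = [bool(value) for value in false_values]
true_results = [bool(value) for value in true_values]
```

Note that a container is true as soon as it has any item, even if that item is itself false, as with (0,) and [0].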
In Python 2.3, bool becomes a type (a subclass of int) and True and False are the values of that type. The only substantial effect of this innovation is that the string representations of Boolean values become 'True' and 'False', while in earlier versions they are '1' and '0'.
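The sketch below, which assumes Python 2.3 or later, shows what it means for bool to be a subclass of int: Boolean values still behave as the integers 1 and 0 in arithmetic and comparisons, and only their string form differs.

```python
is_subclass = issubclass(bool, int)
is_int_instance = isinstance(True, int)

# Arithmetic and comparison treat True and False as 1 and 0.
total = True + True
equal_to_ints = (True == 1 and False == 0)

# The string representations are the names, not the digits.
as_text = (str(True), str(False))
```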
The 2.2.1 and 2.3 changes are handy because they let you speak of functions and expressions as “returning True or False” or “returning a Boolean.” The changes also let you write clearer code when you want to return a truth value (e.g., return True instead of return 1).