Data Definition Language
The DDL is used to define the structure of data containers and objects
within the database. The most common of these containers and objects are
tables, indexes, and views. As you’ll see, most objects are defined with a
variation of the
CREATE command, such as
CREATE TABLE or
CREATE VIEW. The
DROP command is used to delete an existing object (and all
of the data it might contain). Examples include
DROP TABLE or
INDEX. Because the command syntax is so different, statements like
CREATE TABLE or
CREATE INDEX are usually considered to be
separate commands, and not variations of a single
In a sense, the DDL commands are similar to C/C++ header files. DDL commands are used to define the structure, names, and types of the data containers within a database, just as a header file typically defines type definitions, structures, classes, and other data structures. DDL commands are typically used to set up and configure a brand new database before data is entered.
DDL commands are often held in a script file, so that
the structure of the database can be easily recreated. Sometimes, especially
during development, you may need to recreate only part of the database. To
help support this, most
in SQLite have an optional
IF NOT EXISTS
statement will return an error if an object with the requested name already
exists. If the optional
IF NOT EXISTS
clause is present, then this error is suppressed and nothing is done, even
if the structure of the existing object and the new object are not
compatible. Similarly, most
statements allow an optional
clause that silently ignores any request to delete an object that isn’t
In the examples that follow, the
IF EXISTS and
NOT EXISTS command variations are not explicitly called out.
Please see the SQLite command reference in Appendix C for
the full details on the complete syntax supported by SQLite.
The most common DDL command is
TABLE. No data values can be stored in a database
until a table is defined to hold that data. At the bare minimum, the
CREATE TABLE command defines
the table name, plus the name of each column. Most of the time you’ll
want to define a type for each column as well, although types are
optional when using SQLite. Optional constraints, conditions, and
default values can also be assigned to each column. Table-level
constraints can also be assigned, if they involve multiple
In some large RDBMS systems,
CREATE TABLE can be quite complex,
defining all kinds of storage options and configurations. SQLite’s
CREATE TABLE is somewhat
simpler, but there are still a great many options available. For full
explanation of the command, see CREATE TABLE
in Appendix C.
column_type, [...] );
A table name must be provided to identify the new table. After that, there is just a simple list of column names and their types. Table names come from a global namespace of all identifiers, including the names of tables, views, and indexes.
Clear and concise identifier names are important. Like the database design itself, some careful thought should be put into the table name, trying to pin down the precise meaning of the data it contains. Much the same could be said of column names. Table names and column names tend to be referenced frequently in queries, so there is some desire to keep them as brief as possible while still keeping their purpose clear.
Most databases use strong, static column typing. This means that the elements of a column can only hold values compatible with the column’s defined type. SQLite utilizes a dynamic typing technique known as manifest typing. For each row value, manifest typing records the value’s type along with the value data. This allows nearly any element of any row to hold almost any type of value.
In the strictest sense, SQLite supports only five concrete datatypes. These are known as storage classes, and represent the different ways SQLite might choose to store data on disk. Every value has one of these five native storage classes:
A signed integer number. Integer values are variable length, being 1, 2, 3, 4, 6, or 8 bytes in length, depending on the minimum size required to hold the specific value. Integer have a range of −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807, or roughly 19 digits. Literal integers are represented by any bare series of numeric digits (without commas) that does not include a decimal point or exponent.
A floating-point number, stored as an 8-byte value in the processor’s native format. For nearly all modern processors, this is an IEEE 754 double-precision number. Literal floating-point numbers are represented by any bare series of numeric digits that include a decimal point or exponent.
A length of raw bytes, copied exactly as provided. Literal BLOBs are represented as hexadecimal text strings preceded by an
x. For example, the notation
x'1234ABCD'represents a 4-byte BLOB. BLOB stands for Binary Large OBject.
SQLite text and BLOB values are always variable length. The maximum size of a text or BLOB value is limited by a compile-time directive. The default limit is exactly one billion bytes, or slightly less than a full gigabyte. The maximum value for this directive is two gigabytes.
Since the elements of most columns can hold
any value type, the “type” of a column is a somewhat misleading
concept. Rather than being an absolute type, as in most
databases, an SQLite column type (as defined in
CREATE TABLE) becomes more of a
suggestion than a hard and fast rule. This is known as
type affinity, and essentially represents
a desired category of type. Each type affinity has specific
rules about what types of values it can store, and how different
values will be converted when stored in that column. Generally,
a type affinity will cause a conversion or migration of types
only if it can be done without losing data or precision.
A column with a text affinity will only store values of type NULL, text, or BLOB. If you attempt to store a value with a numeric type (float or integer) it will be converted into a text representation before being stored as a text value type.
A column with a numeric affinity will store any of the five types. Values with integer and float types, along with NULL and BLOB types, are stored without conversion. Any time a value with a text type is stored, an attempt is made to convert the value to a numeric type (integer or float). Assuming the conversion works, the value is stored in an appropriate numeric type. If the conversion fails, the text value is stored without any type of conversion.
A column with an integer affinity works essentially the same as a numeric affinity. The only difference is that any value with a float type that lacks a fractional component will be converted into an integer type.
A column with a floating-point affinity also works essentially the same as a numeric affinity. The only difference is that most values with integer types are converted into floating-point values and stored as a float type.
Since type affinities are not part of the SQL standard, SQLite has a series of rules that attempt to map traditional column types to the most logical type affinity. The type affinity of a column is determined by the declared type of the column, according to the following rules (substring matches are case-insensitive):
As implied by the first rule, the column
type is completely optional. SQLite will allow you to create a
table by simply naming the columns, such as
CREATE TABLE t ( i, j, k);.
You’ll also notice that there isn’t any specific list of column
types that are recognized. You can use any column type you want,
even making up your own names.
This might sound a bit fast and loose for a typing system, but it works out quite well. By keying off specific substrings, rather than trying to define specific types, SQLite is able to handle SQL statements (and their database-specific types) from just about any database, all while doing a pretty good job of mapping the types to an appropriate affinity. About the only type you need to be careful of is “floating point.” The “int” in “point” will trigger rule 2 before the “floa” in “floating” will get to rule 5, and the column affinity will end up being integer.
In addition to column names and types, a table definition can also
impose constraints on specific columns or sets of columns. A
more complete view of the
TABLE command looks something like this:
column_constraints..., [... ,]
table_constraints, [...] );
Here we see that each individual column can have additional, optional constraints and modifiers placed on it. Column constraints only affect the column for which they are defined, while table constraints can be used to define a constraint across one or more columns. Constraints that involve two or more columns must be defined as table constraints.
Column constraints can be used to define a
custom sort order (
Collations determine how text values are sorted. In addition to
user-defined collations, SQLite includes a
NOCASE collation that ignores
case when sorting.
A default value (
DEFAULT ) can
also be assigned. Nearly all columns have a default of NULL. You
can use the
constraint to set a different default value. The default can
either be a literal or an expression. Expressions must be placed
To help with date and time defaults, SQLite
also includes three special keywords that may be used as a default value:
CURRENT_TIMESTAMP. These will
record the UTC time, date, or date and time, respectively, when
a new row is inserted. See Date and Time Features for
more information on date and time functions.
Column constraints can also impose limits on
a column, like denying NULL (
NULL) or requiring a unique value for each row
UNIQUE). Remember that
a NULL is not considered a value, so
UNIQUE does not imply
NOT NULL, nor does
UNIQUE imply only a single NULL is allowed. If
you want to keep NULL assignments out of a
UNIQUE column, you need to
explicitly mark the column
Column values can also be subjected to
arbitrary user-defined constraints before they are assigned
Some types of constraints also allow you to specify the action
to be taken in situations when the constraint would be violated.
See CREATE TABLE and UPDATE in Appendix C
for more specific details.
CREATE TABLE parts ( part_id INTEGER PRIMARY KEY, stock INTEGER DEFAULT 0 NOT NULL, desc TEXT CHECK( desc != '' ) -- empty strings not allowed );
In order to enforce a
UNIQUE column constraint, a
unique index will be automatically created over that column. A
different index will be created for each column (or set of
is some expense in maintaining an index, so be aware that
constraint can have performance considerations.
In addition to these other constraints, a
single column (or set of columns) can be designated as the
PRIMARY KEY. Each table
can have only one primary key. Primary keys must be unique, so
designating a column as
KEY implies the
UNIQUE constraint as well, and will result in
an automatic unique index being created. If a column is marked
PRIMARY KEY, only one index will
KEY does not imply
NULL. This is in contradiction to the SQL
standard and is considered a bug, but the behavior is so
long-standing that there are concerns about fixing it and
breaking existing applications. As a result, it is always a good
idea to explicitly mark at least one column from each
PRIMARY KEY as
There are also some good design reasons for
defining a primary key, which will be discussed in Tables and Keys, but the only significant,
concrete thing that comes out of defining a
PRIMARY KEY is the automatic
unique index. There are also some minor syntax shortcuts.
SQLite must have some column that can be
used to index the base storage for the table. In a sense, that
column acts as the master index that is used to store the table
itself. Like many other database systems, SQLite will silently
create a hidden
for this purpose. Different database systems use different names
so, in an effort to maintain compatibility, SQLite will
recognize the names
_ROWID_ to reference the root
columns are not returned (even for column wildcards), nor are
their values included in dump files.
If a table includes an
INTEGER PRIMARY KEY column, then
that column becomes an alias for the automatic
ROWID column. You can still
reference the column by any of the
ROWID names, but you can also reference the
column by its “real” user-defined name. Unlike
PRIMARY KEY by itself,
INTEGER PRIMARY KEY columns do
have an automatic
constraint associated with them. They are also strictly typed to
only accept integer values.
There are two significant advantages of
INTEGER PRIMARY KEY
columns. First, because the column aliases the table’s root
ROWID column, there is
no need for a secondary index. The table itself acts as the
index, providing efficient lookups without the maintenance costs
of an external index.
PRIMARY KEY columns can automatically provide
unique default values. When you insert a row without an explicit
value for the
ROWID alias) column,
SQLite will automatically choose a value that is one greater
than the largest existing value in the column. This provides an
easy means to automatically generate unique keys. If the maximum
value is reached, the database will randomly try other values,
looking for an unused key.
KEY columns can optionally be marked as
AUTOINCREMENT. In that case, the
automatically generated ID values will constantly increase,
preventing the reuse of an ID value from a previously deleted
row. If the maximum value is reached, insertions with automatic
INTEGER PRIMARY KEY
values are no longer possible. This is unlikely, however, as the
INTEGER PRIMARY KEY
type domain is large enough to allow 1,000 inserts per second
for almost 300 million years.
When using either automatic or
AUTOINCREMENT values, it is always possible to
insert an explicit
ROWID alias) value.
Other than the
KEY designation, SQLite offers no other type of
automatic sequence feature.
In addition to a
columns can also be marked as a
KEY. These columns reference rows in another
(foreign) table. Foreign keys can be used to create links
between rows in different tables. See Tables and Keys for details.
Table definitions can also include table-level constraints. In general, table constraints and column constraints work the same way. Table-level constraints still operate on individual rows. The main difference is that using the table constraint syntax, you can apply the constraint to a group of columns rather than just a single column. It is perfectly legal to define a table constraint with only one column, effectively defining a column constraint. Multicolumn constraints are sometimes known as compound constraints.
At the table level, SQLite supports the
PRIMARY KEY constraints. The
check constraint is very similar, requiring only an expression
PRIMARY KEY constraints, when
given as a table constraint, require a list of columns (e.g.,
(). As with column constraints, any
PRIMARY KEY (which
will automatically create a unique index over the appropriate
Table constraints that are applied to
multiple columns use the set of columns as a group. For example,
PRIMARY KEY is applied across
more than one column, each individual column is allowed to have
duplicate values. The constraint only prevents the set of values
across the designated columns from being replicated. If you
wanted each individual column to also be
UNIQUE, you’d need to add the
appropriate constraints to the individual columns.
CREATE TABLE rooms ( room_number INTEGER NOT NULL, building_number INTEGER NOT NULL, [...,] PRIMARY KEY( room_number, building_number ) );
Clearly we need to allow for more than one
room with the number 101. We also need to allow for more than
one room in building 103. But there should only be one room 101
in building 103, so we apply the constraint across both columns.
In this example, we’ve chosen to make these columns into a
compound primary key, since the building number and room number
combine to quintessentially define a specific room. Depending on
the design of the rest of the database, it might have been
equally valid to define a simple
UNIQUE constraint across these two columns, and
designated an arbitrary
room_id column as the primary key.
Tables from queries
CREATE [TEMP] TABLE
Using this form, you do not designate the
number of columns or their names or types. Rather, the query
statement is run and the output is used to define the column
names and preload the new table with data. With this syntax,
there is no way to designate column constraints or modifiers.
Any result column that is a direct column reference will inherit
the column’s affinity, but and all columns are given a
NONE affinity. The query
statement consists of a
SELECT command. More information on
SELECT can be found in Chapter 5.
Tables created in this manner are not
dynamically updated—the query command is run only once when the
table is created. Once the data is entered into the new table,
it remains unaltered until you change it. If you need a
table-like object that can dynamically update itself, use a
This example shows the optional
TEMP keyword (the full word
TEMPORARY can also be
CREATE TEMP TABLE.
This modifier can be used on any variation of
CREATE TABLE, but is frequently
used in conjunction with the
SELECT... variation shown here. Temporary tables
have two specific features. First, temporary tables can only be
seen by the database connection that created them. This allows
the simultaneous re-use of table names without any worry of
conflict between different clients. Second, all associated
temporary tables are automatically dropped and deleted whenever
a database connection is closed.
CREATE TABLE...AS SELECT is not the best choice
for creating standard
tables. If you need to copy data from an old table into a new
table, a better choice is to use
TABLE to define an empty table with all of the
appropriate column modifiers and table constraints. You can then
bulk copy all the data into the new table using a variation of
INSERT command that
allows for query statements. See INSERT for details.
SQLite supports a limited version of the
ALTER TABLE command. Currently, there are only
two operations supported by
TABLE: add column and
rename. The add column variant allows
you to add new columns to an existing table. It cannot remove
them. New columns are always added to the end of the column
list. Several other restrictions apply.
If you need to make a more significant change while preserving as much data as possible, you can use the rename variant to rename the existing table, create a new table under the original name, and then copy the data from the old table to the new table. The old table can then be safely dropped.
TABLE command is used to create tables and
DROP TABLE is used to delete them. The
DROP TABLE command deletes a
table and all of the data it contains. The table definition is
also removed from the database system catalogs.
In addition to deleting the table,
DROP TABLE will also drop any
indexes associated with the table. Both automatically created
indexes (such as those used to enforce a
UNIQUE constraint) as well as
manually created indexes will be dropped.
Virtual tables can be used to connect any data source to SQLite, including
other databases. A virtual table is created with the
CREATE VIRTUAL TABLE command.
Although very similar to
TABLE, there are important differences. For
example, virtual tables cannot be made temporary, nor do they
allow for an
IF NOT EXISTS
clause. To drop a virtual table, you use the normal
DROP TABLE command.
For more information on virtual tables,
including the full syntax for
TABLE, see Chapter 10.
Views provide a way to package queries into a predefined object. Once
created, views act more or less like read-only tables. Just like
tables, new views can be marked as
TEMP, with the same result. The basic syntax of
CREATE VIEW command is:
CREATE [TEMP] VIEW
syntax is almost identical to the
TABLE...AS SELECT command. This is because both
commands serve a similar purpose, with one important difference. The
result of a
CREATE TABLE command is
a new table that contains a full copy of the data. The
SELECT statement is run exactly once
and the output of the query is stored in the newly defined table. Once
created, the table will hold its own, independent copy of the
A view, on the other hand, is fully dynamic. Every
time the view is referenced or queried, the underlying
SELECT statement is run to regenerate
the view. This means the data seen in a view automatically updates as
the data changes in the underlying tables. In a sense, views are
almost like named queries.
Views are commonly used in one of two ways. First, they can be used to package up commonly used queries into a more convenient form. This is especially true if the query is complex and prone to error. By creating a view, you can be sure to get the same query each time.
Views are also commonly used to create user-friendly versions of standard tables. A common example are tables with date and time records. Normally, any time or date value is recorded in Coordinated Universal Time, or UTC. UTC is a more proper format for dates and times because it is unambiguous and time-zone independent. Unfortunately, it can also be a bit confusing if you’re several time zones away. It is often useful to create a view that mimics the base table, but converts all the times and dates from UTC into the local time zone. This way the data in the original tables remains unchanged, but the presentation is in units that are more user-friendly.
Indexes (or indices) are a means to optimize database lookups by pre-sorting and indexing one or more columns of a table. Ideally, this allows specific rows in a table to be found without having to scan every row in the table. In this fashion, indexes can provide a large performance boost to some types of queries. Indexes are not free, however, requiring updates with each modification to a table as well as additional storage space. There are even some situations when an index will cause a drop in performance. See Indexes for more information on when it makes sense to use an index.
The basic syntax for creating an index specifies the name of the new index, as well as the table and column names that are indexed. Indexes are always associated with one (and only one) table, but they can include one or more columns from that table:
CREATE [UNIQUE] INDEX
column_name[, ...] );
Normally indexes allow duplicate values. The
UNIQUE keyword indicates that
duplicate entries are not allowed, and any attempt to insert or update
a table with a nonunique value will cause an error. For unique indexes
that reference more than one column, all the columns must match for an
entry to be considered duplicate. As discussed with
CREATE TABLE, NULL isn’t considered a
value, so a
UNIQUE index will not
prevent one or more NULLs. If you want to prevent NULLs, you must
NOT NULL in the original
Each index is tied to a specific table, but they
all share a common namespace. Although you can name an index anything
you like, it is standard practice to name an index with a standard
prefix (such as
idx_), and then
include the table name and possibly the names of the columns included
in the index. For example:
CREATE INDEX idx_employees_name ON employees ( name );
As described earlier, the
CREATE TABLE command will automatically create unique
indexes to enforce a
PRIMARY KEY constraint. All
automatic indexes will start with an
sqlite_ prefix. Because these indexes are required to
enforce the table definition, they cannot be manually dropped with the
DROP INDEX command. Dropping
the automatic indexes would alter the table behavior as defined by the
Conversely, if you have manually defined a
UNIQUE index, dropping that
index will allow the database to insert or update redundant data. Be
careful when auditing indexes and remember that not all indexes are
created for performance reasons.