The Data Manipulation Language is all about getting user data in and out of the database. After all the data structures and other database objects have been created with DDL commands, DML commands can be used to load those data structures full of useful data.
The DML supported by SQLite falls into two basic
categories. The first category consists of the “update” commands, which
includes the actual UPDATE
command, as
well as the INSERT
and DELETE
commands. As you might guess, these
commands are used to update (or modify), insert, and delete the rows of a
table. All of these commands alter the stored data in some way. The update
commands are the primary means of managing all the data within a
database.
The second category consists of the “query” commands,
which are used to extract data from the database. Actually, there is only
one query command: SELECT
. The SELECT
command not only prints returned
values, but provides a great number of options to combine different tables
and rows and otherwise manipulate data before returning the final
result.
SELECT
is,
unquestionably, the most complex SQL command. It is also, arguably, the most
important SQL command. This chapter will only cover the very basics of
SELECT
, and then we will spend the
next chapter going through all of its parts, bit by bit. To address the full
command syntax in detail, SELECT
gets a
whole chapter to itself (Chapter 5).
There are three commands used for adding,
modifying, and removing data from the database. INSERT
adds new rows, UPDATE
modifies existing rows, and
DELETE
removes rows. These
three commands are used to maintain all of the actual data values
within the database. All three update commands operate at a row level,
adding, altering, or removing the specified rows. Although all three
commands are capable of acting on multiple rows, each command can only
directly act upon rows contained within a single table.
The INSERT
command is used to create new rows in the specified table.
There are two meaningful versions of the command. The first
version uses a VALUES
clause
to specify a list of values to insert:
INSERT INTOtable_name
(column_name
[, ...]) VALUES (new_value
[, ...]);
A table name is provided, along with a list of columns and a list of values. Both lists must have the same number of items. A single new row is created and each value is recorded into its respective column. The columns can be listed in any order, just as long as the list of columns and the list of values line up correctly. Any columns that are not listed will receive their default values:
INSERT INTO parts ( name, stock, status ) VALUES ( 'Widget', 17, 'IN STOCK' );
In this example, we attempt to insert a new row into a “parts” table. Note the use of single quotes for text literals.
Technically, the list of column names is
optional. If no explicit list of columns is provided, the
INSERT
command will
attempt to pair up values with the table’s full list of
columns:
INSERT INTOtable_name
VALUES (new_value
[, ...]);
The trick with this format is that the
number and order of values must exactly match the number and
order of columns in the table definition. That means it is
impossible to use default values, even on INTEGER PRIMARY KEY
columns.
More often than not, this is not actually desirable. This format
is also harder to maintain within application source code, since
it must be meticulously updated if the table format changes. In
general, it is recommended that you always explicitly list out
the columns in an INSERT
statement.
When bulk importing data, it is common to
loop over data sets, calling INSERT
over and over. Processing these
statements one at a time can be fairly slow, since each command
will update both the table and any relevant indexes, and then
make sure the data is fully written out to physical disk before
(finally!) starting the next INSERT
. This is a fairly lengthly process,
since it requires physical I/O.
To speed up bulk inserts, it is common to
wrap groups of 1,000 to 10,000 INSERT
statements into a single transaction.
Grouping the statement together will substantially increase the
overall speed of the inserts by delaying the physical I/O until
the end of the transaction. See Transaction Control Language
for more information on transactions.
The second version of INSERT
allows you to define
values by using a query statement. This is very similar to the
CREATE TABLE...AS
SELECT
command, although the table must already
exist. This is the only version of INSERT
that can insert more than one row with a
single command:
INSERT INTOtable_name
(column_name
, [...]) SELECTquery_statement
;
This type of INSERT
is most commonly used to bulk copy data
from one table to another. This is a common operation when you
need to update the definition of a table, but you don’t want to
lose all the data that already exists in the database. The old
table is renamed, the new table is defined, and the data is
copied from the old table into the new table using an INSERT INTO...SELECT
command.
This form can also be used to populate temporary tables or copy
data from one attached database to another.
As with the VALUES
version of INSERT
, the column list is technically optional
but, for all the same reasons, it is still recommended that you
provide an explicit column list.
All versions of the INSERT
command also support an optional
conflict resolution clause. This conflict clause determines what
should be done if the results of the INSERT
would violate a database constraint. The
most common example is INSERT OR
REPLACE
, which comes into play when the INSERT
would, as executed, cause
a UNIQUE
constraint
violation. If the REPLACE
conflict resolution is present, any existing row that would
cause a UNIQUE
constraint
violation is first deleted, and then the INSERT
is allowed to continue.
This specific usage pattern is so common that the whole INSERT OR REPLACE
phrase can be
replaced by just REPLACE
. For
example, REPLACE INTO
.table_name
...
See INSERT and UPDATE in Appendix C for more information on the details of conflict resolution.
The UPDATE
command is used to assign new values to one or more columns of
existing rows in a table. The command can update more than one
row, but all of the rows must be part of the same table. The
basic syntax is:
UPDATEtable_name
SETcolumn_name
=new_value
[, ...] WHEREexpression
The command requires a table name followed
by a list of column name/value pairs that should be assigned.
Which rows are updated is determined by a conditional expression
that is tested against each row of the table. The most common
usage pattern uses the expression to check for equality on some
unique column, such as a PRIMARY
KEY
column.
Caution
If no WHERE
condition is given, the UPDATE
command will
attempt to update the designated columns in
every row of a table.
It is not considered an error if the
WHERE
expression
evaluates to false for every row in the table, resulting in no
actual updates.
Here is a more specific example:
-- Update the price and stock of part_id 454: UPDATE parts SET price = 4.25, stock = 75 WHERE part_id = 454;
This example assumes that the table parts
has at least three
columns: price
, stock
, and part_id
. The database will find
each row with a part_id
of
454. In this case, it can be assumed that part_id
is a PRIMARY KEY
column, so only one
row will be updated. The price
and stock
columns of that row are then assigned new
values.
The full syntax for UPDATE
can be found at UPDATE in Appendix C.
As you might guess, the DELETE
command is used to delete or remove one or more rows from a
single table. The rows are completely deleted from the
table:
DELETE FROMtable_name
WHEREexpression
;
The command requires only a table name and a
conditional expression to pick out rows. The WHERE
expression is used to
select specific rows to delete, just as it is used in the
UPDATE
command.
Caution
If no WHERE
condition is given, the DELETE
command will
attempt to delete every row of a
table.
As with UPDATE
, it is not considered an error if the
WHERE
expression
evaluates to false for every row in the table, resulting in no
actual deletions.
-- Delete the row with rowid 385: DELETE FROM parts WHERE part_id = 385; -- Delete all rows with a rowid greater than or equal to 43 -- and less than or equal to 246: DELETE FROM parts WHERE part_id >= 43 AND part_id <= 246;
These examples assume we have a table named
parts
that contains at
least one unique column named part_id
.
As noted, if no WHERE
clause is given, the DELETE
command will attempt to
delete every row in a table. SQLite optimizes this specific
case, truncating the full table, rather than processing each
individual row. Truncating the table is much faster than
deleting each individual row, but truncation bypasses the
individual row processing. If you wish to process each row as it
is deleted, provide a WHERE
clause that always evaluates to true:
DELETE FROM parts WHERE 1; -- delete all rows, force per-row processing
The existence of the WHERE
clause will prevent the
truncation, allowing each row to be processed in turn.
The final DML command to cover is the SELECT
command. SELECT
is used to extract
or return values from the database. Almost any time you want to
extract or return some kind of value, you’ll need to use the SELECT
command. Generally, the
returned values are derived from the contents of the database, but
SELECT
can also be used to
return the value of simple expressions. This is a great way to test
out expressions, for example:
sqlite> SELECT 1+1, 5*32, 'abc'||'def', 1>2;
1+1 5*32 'abc' || 'def' 1>2
---------- ---------- -------------- ----------
2 160 abcdef 0
SELECT
is a
read-only command, and will not modify the database (unless the
SELECT
is embedded in a
different command, such as a CREATE
TABLE...AS SELECT
or an INSERT INTO...SELECT
).
Without question, SELECT
is the most complex SQL command, both in terms
of syntax as well as function. The SELECT
syntax tries to represent a generic framework
that is capable of expressing a great many different types of queries.
While it is somewhat successful at this, there are areas where
SELECT
has traded away
simplicity for more flexibility. As a result, SELECT
has a large number of optional clauses, each
with its own set of options and formats.
Understanding how to mix and match these optional
clauses to get the result you’re looking for can take some time. While
the most basic syntax can be shown with a good set of examples, to
really wrap your head around SELECT
, it is best to understand how it actually works and
what it is trying to accomplish.
Because SELECT
can be so complex, and because SELECT
is an extremely important command, we will
spend the whole next chapter looking very closely at SELECT
and each of its clauses. There
will be some discussion about what is going on behind the scenes, to
provide more insight into how to read and write complex
queries.
For now, we’ll just give you a taste. That should
provide enough information to play around with the other commands in
this chapter. The most basic form of SELECT
is:
SELECToutput_list
FROMinput_table
WHERErow_filter
;
The output list is a list of expressions that
should be evaluated and returned for each resulting row. Most
commonly, this is simply a list of columns. The output list can also
include a wildcard (*
) that
indicates all known columns should be returned.
The FROM
clause
defines the source of the table data. The next chapter will show
how tables can be linked and joined, but for now we’ll stick with
querying one table at a time.
The WHERE
clause is a conditional filtering expression that is applied to each
row. It is essentially the same as the WHERE
clause in the UPDATE
and DELETE
commands. Those rows that evaluate to true will be part of the result,
while the other rows will be filtered out.
sqlite>CREATE TABLE tbl ( a, b, c, id INTEGER PRIMARY KEY );
sqlite>INSERT INTO tbl ( a, b, c ) VALUES ( 10, 10, 10 );
sqlite>INSERT INTO tbl ( a, b, c ) VALUES ( 11, 15, 20 );
sqlite>INSERT INTO tbl ( a, b, c ) VALUES ( 12, 20, 30 );
We can return the whole table like this:
sqlite> SELECT * FROM tbl;
a b c id
---------- ---------- ---------- ----------
10 10 10 1
11 15 20 2
12 20 30 3
We can also just return specific columns:
sqlite> SELECT a, c FROM tbl;
a c
---------- ----------
10 10
11 20
12 30
sqlite> SELECT * FROM tbl WHERE id = 2;
a b c id
---------- ---------- ---------- ----------
11 15 20 2
For more specifics, see Chapter 5 and SELECT in Appendix C.
Get Using SQLite now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.