The Transaction Control Language is used in conjunction with the Data Manipulation Language to control the processing and exposure of changes. Transactions are a fundamental part of how relational databases protect the integrity and reliability of the data they hold. Transactions are automatically used on all DDL and DML commands.
A transaction is used to group together a series of low-level changes into a single, logical update. A transaction can be anything from updating a single value to a complex, multistep procedure that might end up inserting several rows into a number of different tables.
The classic transaction example is a database that holds account numbers and balances. If you want to transfer a balance from one account to another, that is a simple two-step process: subtract an amount from one account balance and then add the same amount to the other account balance. That process needs to be done as a single logical unit of change, and should not be broken apart. Both steps should either succeed completely, resulting in the balance being correctly transferred, or both steps should fail completely, resulting in both accounts being left unchanged. Any other outcome, where one step succeeds and the other fails, is not acceptable.
Typically a transaction is opened, or started. As individual data manipulation commands are issued, they become part of the transaction. When the logical procedure has finished, the transaction can be committed, which applies all of the changes to the permanent database record. If, for any reason, the commit fails, the transaction is rolled back, removing all traces of the changes. A transaction can also be manually rolled back.
The standard for reliable, robust transactions is the ACID test. ACID stands for Atomic, Consistent, Isolated, and Durable. Any transaction system worth using must possess these qualities.
- Atomic
A transaction should be atomic, in the sense that the change cannot be broken down into smaller pieces. When a transaction is committed to the database, the entire transaction must be applied or the entire transaction must not be applied. It should be impossible for only part of a transaction to be applied.
- Consistent
A transaction should also keep the database consistent. A typical database has a number of rules and limits that help ensure the stored data is correct and consistent with the design of the database. Assuming a database starts in a consistent state, applying a transaction must keep the database consistent. This is important, because the database is allowed to (and is often required to) become inconsistent while the transaction is open. For example, while transferring funds, there is a moment between the subtraction from one account and the addition to another account that the total amount of funds represented in the database is altered and may become inconsistent with a recorded total. This is acceptable, as long as the transaction, as a whole, is consistent when it is committed.
- Isolated
An open transaction must also be isolated from other clients. When a client opens a transaction and starts to issue individual change commands, the results of those commands are visible to the client. Those changes should not, however, be visible to any other system accessing the database, nor should they be integrated into the permanent database record until the entire transaction is committed. Conversely, changes committed by other clients after the transaction was started should not be visible to this transaction. Isolation is required for transactions to be atomic and consistent. If other clients could see half-applied transactions, the transactions could not claim to be atomic in nature, nor would they preserve the consistency of the database, as seen by other clients.
- Durable
Last of all, a transaction must be durable. If the transaction is successfully committed, it must have become a permanent and irreversible part of the database record. Once a success status is returned, it should not matter if the process is killed, the system loses power, or the database filesystem disappears—upon restart, the committed changes should be present in the database. Conversely, if the system loses power before a transaction is committed, then upon restart the changes made within the transaction should not be present.
Most people think that the atomic nature of transactions is their most important quality, but all four aspects must work together to ensure the overall integrity of the database. Durability, especially, is often overlooked. SQLite tries extremely hard to guarantee that if a transaction is successfully committed, those changes are actually physically written to permanent storage and are there to stay. Compare this to traditional filesystem operations, where writes might go into an operating system file cache. Updates may sit in the cache anywhere from a few seconds to a few minutes before finally being spooled off to storage. Even then, it is possible for the data to wait around in device buffers before finally being committed to physical storage. While this type of buffering can increase efficiency, it means that a normal application really has no idea when its data is safely committed to permanent storage.
Power failures and disappearing filesystems may seem like rare occurrences, but that’s not really the point. Databases are designed to deal with absolutes, especially when it comes to reliability. Besides, having a filesystem disappear is not that radical of an idea when you consider the prevalence of flash drives and USB thumb drives. Disappearing media and power failures are even more commonplace when you consider the number of SQLite databases that are found on battery-operated, handheld devices such as mobile phones and media players. The use of transactions is even more important on devices like this, since it is nearly impossible to run data recovery tools in that type of environment. These types of devices must be extremely robust and, no matter what the user does (including yanking out flash drives at inconvenient times), the system must stay consistent and reliable. Use of a transactional system can provide that kind of reliability.
Transactions are not just for writing data. Opening a transaction for an extended read-only operation is sometimes useful if you need to gather data with multiple queries. Having the transaction open keeps your view of the database consistent, ensuring that the data doesn’t change between queries. That is useful if, for example, you use one query to gather a bunch of record IDs, and then issue a series of queries against each ID value. Wrapping all the queries in a transaction guarantees all of the queries see the same set of data.
Normally, SQLite is in autocommit mode. This means that SQLite will automatically start a transaction for each command, process the command, and (assuming no errors were generated) automatically commit the transaction. This process is transparent to the user, but it is important to realize that even individually entered commands are processed within a transaction, even if no TCL commands are used.
The autocommit mode can be disabled by explicitly
opening a transaction. The BEGIN
command is used to start or open a transaction. Once an explicit
transaction has been opened, it will remain open until it is committed
or rolled back. The keyword
TRANSACTION
is optional:
BEGIN [ DEFERRED | IMMEDIATE | EXCLUSIVE ] [TRANSACTION]
The optional keywords
DEFERRED
, IMMEDIATE
, or EXCLUSIVE
are specific to SQLite and control how the
required read/write locks are acquired. If only one client is
accessing the database at a time, the locking mode is largely
irrelevant. When more than one client may be accessing the database,
the locking mode defines how to balance peer access with ensured
success of the transaction.
By default, all transactions (including autocommit
transactions) use the DEFERRED
mode. Under this mode, none of the database locks are acquired until
they are required. This is the most “neighborly” mode and allows other
clients to continue accessing and using the database until the
transaction has no other choice but to lock them out. This allows
other clients to continue using the database, but if the locks are not
available when the transaction requires them, the transaction will
fail and may need to be rolled back and restarted.
BEGIN IMMEDIATE
attempts to acquire a reserved lock immediately. If it
succeeds, it guarantees the write locks will be available to the
transaction when they are needed, but still allows other clients to
continue to access the database for read-only operations. The EXCLUSIVE
mode attempts to lock out
all other clients, including read-only
clients. Although the IMMEDIATE
and
EXCLUSIVE
modes are more
restrictive to other clients, the advantage is that they will fail
immediately if the required locks are not available, rather than after
you’ve issued your DDL or DML commands.
Once a transaction is open, you can continue to
issue other SQL commands, including both DML and DDL commands. You can
think of the changes resulting from these commands as “proposed”
changes. The changes are only visible to the local client and have not
been fully and permanently applied to the database. If the client
process is killed or the server loses power in the middle of an open
transaction, the transaction and any proposed changes it has will be
lost, but the rest of the database will remain intact and consistent.
It is not until the transaction is closed that the proposed changes
are committed to the database and made “real.” The COMMIT
command is used to close out a transaction and commit the changes
to the database. You can also use the alias
END
. As with BEGIN
, the TRANSACTION
keyword is optional.
COMMIT [TRANSACTION] END [TRANSACTION]
Once a COMMIT
has successfully returned, all the proposed changes are fully
committed to the database and become visible to other clients. At that
point, if the system loses power or the client process is killed, the
changes will remain safely in the database.
Things don’t always go right, however. Rather than committing the proposed changes, the transaction can be manually rolled back, effectively canceling the transaction and all of the changes it contains. Rolling back a set of proposed changes is useful if an error is encountered. This might be a database error, such as running out of disk space half-way through inserting a series of related records, or it might be an application logic error, such as trying to assign an invoice to an order that doesn’t exist. In such cases, it usually doesn’t make sense to continue with the transaction, nor does it make sense to leave inconsistent data in the database. Pretty much the only choice is to back out and try again.
To cancel the transaction and roll back all the
proposed changes, you can use the ROLLBACK
command. Again, the keyword TRANSACTION
is
optional:
ROLLBACK [TRANSACTION]
ROLLBACK
will
undo and revert all the proposed changes made by the current
transaction and then close the transaction. It does not necessarily
return the database to its prior state, as other clients may have been
making changes in parallel. A ROLLBACK
only cancels the proposed changes made by
this client within the current transaction.
Both COMMIT
and
ROLLBACK
will end the current
transaction, putting SQLite back into autocommit mode.
In addition to ACID-compliant transactions, SQLite also supports save-points. Save-points allow you to mark specific points in the transaction. You can then accept or rollback to individual save-points without having to commit or rollback an entire transaction. Unlike transactions, you can have more than one save-point active at the same time. Save-points are sometimes called nested transactions.
Save-points are generally used in conjunction with large, multistep transactions, where some of the steps or sub-procedures require rollback ability. Save-points allow a transaction to proceed and (if required) roll back one step at a time. They also allow an application to explore different avenues, attempting one procedure, and if that doesn’t work, trying another, without having to roll back the entire transaction to start over. In a sense, save-points can be thought of as “undo” markers in SQL command stream.
You can create a save-point with the
SAVEPOINT
command. Since multiple
save-points can be defined, you must provide a name to identify the
save-point:
SAVEPOINT savepoint_name
Save-points act as a stack. Whenever you create a new one, it is put at the top of the stack. Save-point identifiers do not need to be unique. If the same save-point identifier is used more than once, the one nearest to the top of the stack is used.
To release a save-point and accept all of the
proposed changes made since the save-point was set, use the
RELEASE
command:
RELEASE [SAVEPOINT] savepoint_name
The RELEASE
command does not commit any changes to disk. Rather, it flattens all
of the changes in the save-point stack into the layer below the named
save-point. The save-point is then removed. Any save-points contained
by the named save-point are automatically released.
To cancel a set of commands and undo everything
back to where a save-point was set, use the
ROLLBACK TO
command:
ROLLBACK [TRANSACTION] TO [SAVEPOINT] savepoint_name
Unlike a transaction ROLLBACK
, a save-point ROLLBACK TO
does not close out and eliminate the
save-point. ROLLBACK TO
rolls back
and cancels any changes issued since the save-point was established,
but leaves the transaction state exactly as it was
after the SAVEPOINT
command was issued.
Consider the following series of SQL statements. The indentation is used to show the save-point stack:
CREATE TABLE t (i); BEGIN; INSERT INTO t (i) VALUES 1; SAVEPOINT aaa; INSERT INTO t (i) VALUES 2; SAVEPOINT bbb; INSERT INTO t (i) VALUES 3;
At this point, if the command ROLLBACK TO bbb
is issued, the state
of the database would be as if the following commands were
entered:
CREATE TABLE t (i); BEGIN; INSERT INTO t (i) VALUES 1; SAVEPOINT aaa; INSERT INTO t (i) VALUES 2; SAVEPOINT bbb;
Again, notice that rolling back to save-point
bbb
still leaves the
save-point in place. Any new commands will be associated with SAVEPOINT bbb
. For example:
CREATE TABLE t (i); BEGIN; INSERT INTO t (i) VALUES 1; SAVEPOINT aaa; INSERT INTO t (i) VALUES 2; SAVEPOINT bbb; DELETE FROM t WHERE i=1;
Continuing, if the command RELEASE aaa
was issued, we would get
the equivalent of:
CREATE TABLE t (i); BEGIN; INSERT INTO t (i) VALUES 1; INSERT INTO t (i) VALUES 2; DELETE FROM t WHERE i=1;
In this case, the proposed changes from both the
aaa
and the enclosed bbb
save-points were released and
merged outward. The transaction is still open, however, and a COMMIT
would
still be required to make the proposed changes permanent.
Even if you have open save-points, you can still issue transaction commands. If the enclosing transaction is committed, all outstanding save-points will automatically be released and then committed. If the transaction is rolled back, all the save-points are rolled back.
If the SAVEPOINT
command is issued when SQLite is in
autocommit mode—that is, outside of a transaction—then a standard
autocommit BEGIN DEFERRED
TRANSACTION
will be started. However, unlike with most
commands, the autocommit transaction will not automatically commit
after the SAVEPOINT
command
returns, leaving the system inside an open transaction. The automatic
transaction will remain active until the original save-point is
released, or the outer transaction is either explicitly committed or
rolled back. This is the only situation when a save-point RELEASE
will have a direct effect on
the enclosing transaction. As with other save-points, if an autocommit
save-point is rolled back, the transaction will remain open and the
original save-point will be open, but empty.
Get Using SQLite now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.