SQL is the language that you use to manipulate relational data. Other languages for data manipulation do exist, but SQL is the most common, and it's the reason for the "SQL" in "SQL*Plus." Because everything you do in SQL*Plus will involve SQL, it's worth digressing for a chapter to focus on learning the rudiments of the language.
As manifested in Oracle, the SQL language encompasses several categories of statements: statements that work with data, other statements that define and modify the structures (such as tables and indexes) that hold data, and still other statements that control the operation of the database itself.
This chapter focuses on statements used to manipulate data:
- INSERT
Places new records, or rows, into a database table.
- SELECT
Retrieves previously inserted rows from a database table.
- UPDATE
Modifies data in a table.
- DELETE
Deletes data from a table.
- MERGE
Brings a table up to date by modifying or inserting rows, as appropriate
The reason I focus on data manipulation in this chapter is that if you're just starting out with SQL, data manipulation is likely to be your first problem. You'll likely be working with tables and other database objects that have been created. Furthermore, data manipulation statements are the ones you'll need to leverage the SQL*Plus reporting capabilities I describe in the next few chapters. Other topics, such as creating database structures (e.g., tables and indexes) and controlling the operation of the database, are database administration topics that rightfully belong in other books.
Use the INSERT statement to add a new row of data to a database table. The following form of INSERT should suffice for almost all your needs:
INSERT INTOtable_name
(column_list
) VALUES (value_list
);
Replace table_name
with the name of
the target table, the table to which you wish to add a new row of
data. Replace column_list
with a
comma-delimited list of column names. These are the columns for which
you will supply values in your value_list
.
For example, type the code from Example 4-1 into SQL*Plus to
insert a new employee.
Example 4-1. Inserting a new employee
INSERT INTO employee (employee_id, employee_name) VALUES ('114','Marusia Bohuslavka');
This example supplies values for only two of the five employee
table columns. When you insert a
row and omit a value for a column, that column takes on a default
value specified by your DBA when creating the table. Often, the
default value for a column is no value at all, or null. I'll talk
about nulls later in this chapter. They're quite important to
understand.
When you insert new rows, you must supply values for the following types of columns:
Primary key columns. These are mandatory. In the
employee
table, theemployee_id
column forms the primary key.Columns defined as NOT NULL that have no default value defined.
If you omit a value for one of these types of column, you'll receive an error, as in Example 4-2.
Example 4-2. Omitting a required column
INSERT INTO employee (employee_name) VALUES ('Hryhory Kytasty');
*
ERROR at line 1:
ORA-01400: cannot insert NULL into ("GENNICK"."EMPLOYEE"."EMPLOYEE_ID")
The error message here is informing you that employee_id
is a mandatory column. When you
receive an error such as this, supply a value for the column mentioned
in the error message and retry the insert.
What good would a database be if you couldn't get information back from it? Indeed, you'll find that you spend more time using SQL to get data out of the database than you spend on any of the other data manipulation operations. The SELECT statement, often referred to as a query, is what you need. Example 4-3 shows a simple query that retrieves all data from a table.
The asterisk (*) in this example is shorthand for "all columns." You can enumerate the columns, as in Example 4-4.
Example 4-4. Enumerating columns in the SELECT list
SELECT employee_id, employee_name, employee_hire_date, employee_termination_date, employee_billing_rate FROM employee;
In a database of any size, you aren't likely to want to retrieve all rows from a table. Rather, you'll want to define some subset of rows that you are interested in. For that, use the WHERE clause. Example 4-5 uses a WHERE clause to restrict the query's results to the row inserted in the previous section.
Example 4-5. Using a WHERE clause to constrain results
SELECT employee_id, employee_name FROM employee WHERE employee_id = 114;
The WHERE clause supports the same comparison operators that you'll find in just about any programming or scripting language:
- =
Are two values the same?
- !=, <>, ^=
Do two values differ?
- <
Is one value less than another?
- <=
Is one value less than or equal to another?
- >
Is one value greater than another?
- >=
Is one value greater than or equal to another?
These are the most basic and common operators. You'll see others in this book that I'll explain as I continue. For a rigorous trip through the WHERE clause, I heartily recommend Sanjay Mishra and Alan Beaulieu's book, Mastering Oracle SQL, Second Edition (O'Reilly).
You can write many conditions in a WHERE clause, which you can link together using AND and OR. You can use parentheses to clarify the order of evaluation. Example 4-6 uses OR to retrieve the following two groups of employees:
Employee number 114
Employees with names beginning with '
Myk
', but only when those employees were hired during or after 2004
Example 4-6. Using AND and OR
SELECT * FROM employee WHERE (employee_id = 114) OR (employee_hire_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy') AND REGEXP_LIKE(employee_name, '^Myk.*'));
Parentheses ensure that the two AND conditions are collectively treated as one condition with respect to the OR operation.
The TO_DATE function converts the string '1-Jan-2004
' into a true DATE value, for
comparison to the hire dates. The call to REGEXP_LIKE identifies
those rows having names that begin with 'Myk
'. The string '^Myk.*
' is a regular expression.
There is also the NOT operator, which you can use to negate a condition. For example, the two queries in Example 4-7 are equivalent.
Example 4-7. Using the NOT operator
SELECT * FROM employee WHERE employee_id <> 114; SELECT * FROM employee WHERE NOT employee_id = 114;
NOT can be handy when writing complex queries because sometimes it's easier to think in terms of those rows that you do not want in your result. You can write conditions to identify those rows you do not want, and you can use NOT to negate those conditions. Example 4-8 shows a slightly modified (NOTified?) version of the query from Example 4-6. This time, I want the statement to return all the rows not returned in Example 4-6. I could have modified each of the three comparisons individually, but that would require a fair bit of thought, and I might make a mistake. It's easier to wrap the entire original logic in parentheses and apply the NOT operator to the result.
Example 4-8. Negating complex logic
SELECT * FROM employee WHERE NOT ((employee_id = 114) OR (employee_hire_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy') AND REGEXP_LIKE(employee_name, '^Myk.*')));
Using NOT as I've just done leaves me open to problems involving null values. This is an important topic I'll discuss later in "The Concept of Null."
When you're writing a query, it's often necessary to qualify
column names by also specifying their table names. You do this using
dot notation, as shown in Example 4-9. The employee
. in front of each column name
specifies that each column belongs to the employee
table.
Example 4-9. Qualifying column names with their table names
SELECT employee.employee_id, employee.employee_name FROM employee;
It's cumbersome to retype long table names many times over. For this reason, SQL allows you to specify alternate names, known as table aliases, for the tables in your query. Specify an alias by placing it immediately after its table name, thereby separating the two names by whitespace, as shown in Example 4-10.
Example 4-10. Specifying and using a table alias
SELECT e.employee_id, e.employee_name FROM employee e;
Keep your aliases short, which is the whole idea. It's much
easier to type e
. than employee
. in front of each column
name.
You really don't need to worry about qualifying column names when selecting from only one table. Later in this chapter, you'll learn how to select from many tables at once. That's when qualifying column names becomes important. If you don't qualify column names when selecting from multiple tables, the database engine must expend extra resources sorting through which name goes with which table. That's usually a trivial performance hit, but it can add up if given enough users and statements. And woe be to you if two tables share a common column name because the database engine won't be able to determine your intent, and you'll get an error.
Just as you can specify aliases for tables, you can also specify aliases for columns. Many people specify column aliases just as they do table aliases, by placing the alias name immediately after the column name. Others are a bit more proper in their approach and use the AS keyword. Example 4-11 demonstrates both approaches.
Example 4-11. Specifying column aliases
SELECT e.employee_id emp_num, e.employee_name "Employee Name", e.employee_billing_rate * 0.50 discounted_rate FROM employee e; SELECT e.employee_id AS emp_num, e.employee_name AS "Employee Name", e.employee_billing_rate * 0.50 AS discounted_rate FROM employee e;
Column aliases are useful for reporting, because SQL*Plus
picks them up and uses them as default column titles. Having
"Employee Name
" at the top of a
column looks much better to a user than "employee_name
". Column aliases are also
useful in naming expressions in your SELECT list such as the one in
Example 4-11 that
discounts the billing rate. Otherwise, the expression itself will be
the column name. In Chapter
5, you'll see that it's much easier to define formatting for
an expression column when the name is one that
you have specified rather than one that the
database has generated for you.
Things change. That seems to be the rule in our world today. Your data will change, too, and for that reason SQL provides the UPDATE statement. Use it to set new values for some or all columns in a row. The UPDATE in Example 4-12 changes the name and the budget for project #1005.
Example 4-12. Changing values in a single row
UPDATE project SET project_name = 'Virtual Private Network', project_budget = 199999.95 WHERE project_id=1005;
The update in Example
4-12 updates only a single row. You can be certain of that,
because project_id
is the primary
key of the project
table, so there
can be at most one row with ID 1005. You can use UPDATE to change
values in many rows, by writing conditions in your WHERE clause to
target those rows. Example
4-13 shows budgets being cut by 30% for all projects but the
corporate web site.
Example 4-13. Changing values in many rows
UPDATE project SET project_budget = project_budget * 0.70 WHERE NOT project_name = 'Corporate Web Site';
You'll notice that the new value for project_budget
in Example 4-13 is the result of
an expression; in Example
4-12, the new values are constants. When updating many rows, it
rarely makes sense to apply the same value to all, so such updates
should use expressions or subqueries to generate new values
appropriate to each row. (See the Section 4.7 for more
information on that approach.)
When you write an update, be sure you know whether it potentially affects more than one row. If the WHERE clause does not specify a single value for the table's primary key, or for a unique key column, then the update could potentially affect many rows. When an update affects many rows, give careful thought to how you will compute new values for those rows.
Tip
If you know for certain that you can uniquely identify a single row via a non-key field, go for it. However, you may want to issue a SELECT first just to be sure. Example 4-13 identified a project by name. Potentially, two projects can have the same name, but I knew that wasn't the case when I wrote the example. In the real world, sometimes you take advantage of what you know about your own data.
It's possible to issue an UPDATE that changes the value in a primary key, but such changes aren't often made, and you're better off avoiding them if you can. Database designers design databases in ways that minimize or eliminate the need to change primary key values. When you change the primary key of a row, you affect the referential integrity of your database. The change must be rippled through to any other rows that refer to the row you are changing. Alternately, the database must prevent you from changing a primary key value that is referenced by another row. Much depends here on how your database schema was designed. In a worst-case scenario, you might have to sort out the proper order for a whole series of updates to foreign-key columns that reference the primary-key value you wish to change.
All things must come to an end, and that's true of data, too. Actually, with today's increasing hard-drive sizes, I'm not so sure that some don't plan to hold their data forever, but let's ignore that complication for now. Use the DELETE statement to get rid of rows you no longer need. Did you insert all those new projects by executing the code in the "INSERT . . . SELECT FROM" sidebar? Too bad. Now your boss has changed his mind. Isn't that annoying? Now it's your job to delete what you just created. Do that using the DELETE statement shown in Example 4-14.
DELETE is a fairly simple statement, but be careful to craft your WHERE clause so you only delete rows that you want to delete. I often write a SELECT statement using the same WHERE conditions and check the results from that before unleashing a DELETE.
New in Oracle9i Database, and in the 2003 ANSI/ISO SQL standard, the MERGE statement solves the age-old problem of needing to update a table or insert a new row depending on whether a corresponding row already exists. Suppose that you've created those overhead projects mentioned in the "INSERT . . . SELECT FROM" sidebar. You'll have used the statement shown in Example 4-15.
Example 4-15. Creating overhead projects
INSERT INTO PROJECT (project_id, project_name) SELECT project_id+8000, SUBSTR(project_name,1,31) || ' Overhead' FROM project;
Example 4-15 creates a new set of projects from an existing set of rows. Now, let's say you're faced with the task of periodically refreshing the list of overhead projects. At the time you do a refresh, you need to account for two possibilities:
For any new projects that have been added, you need to create new overhead projects.
For any existing projects, you need to migrate any name changes to their respective overhead project names.
There are different ways that you can go about solving this problem. One way is to periodically issue the MERGE statement shown in Example 4-16.
Example 4-16. Bringing the list of overhead projects up to date
MERGE INTO project pdest USING (SELECT project_id+8000 project_id, SUBSTR(project_name,1,31) || ' Overhead' project_name FROM project WHERE project_id <= 8000) psource ON (pdest.project_id = psource.project_id) WHEN MATCHED THEN UPDATE SET pdest.project_name = psource.project_name WHEN NOT MATCHED THEN INSERT (project_id, project_name) VALUES (psource.project_id, psource.project_name);
Here's a step-by-step walkthrough of this MERGE statement:
The INTO clause identifies the target table, in this case,
project
. An alias ofpdest
is specified to make it easier to distinguish between the two occurrences of the table in the one statement.The USING clause specifies a SELECT statement to use in generating rows for the merge operation. This SELECT statement's WHERE clause excludes any overhead rows, which are rows having
project_id
values greater than 8000. The results of this SELECT statement are identified through the aliaspsource
.Each row from the SELECT is matched to its corresponding row in
pdest
via the condition given in the ON clause.When a match occurs, which is to say that a row in
pdest
has aproject_id
matching that of a given row inpsource
, an update of theproject_name
is triggered.When no match occurs (for a given row from
psource
, no corresponding row inpdest
can be found) a new row is inserted intopdest
.
MERGE is commonly used in data warehousing environments to periodically update reporting tables from operational data. Often, the source for rows feeding into a MERGE operation will be an external table, a type of table corresponding to an operating system file. There's more in Chapter 9 about using external tables to load data from files.
Get Oracle SQL*Plus: The Definitive Guide, 2nd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.