BUY THIS BOOK
Add to Cart

Print Book $39.95


Add to Cart

Print+PDF $51.94

Add to Cart

PDF $31.99

Safari Books Online

What is this?

Add to UK Cart

Print Book £28.50

What is this?

Looking to Reprint or License this content?


Mastering Oracle SQL
Mastering Oracle SQL, Second Edition

By Sanjay Mishra, Alan Beaulieu
Book Price: $39.95 USD
£28.50 GBP
PDF Price: $31.99

Cover | Table of Contents | Colophon


Table of Contents

Chapter 1: Introduction to SQL
In this introductory chapter, we explore the origin and utility of the SQL language, demonstrate some of the more useful features of the language, and define a simple database design from which most examples in the book are derived.
SQL is a special-purpose language used to define, access, and manipulate data. SQL is nonprocedural, meaning that it describes the necessary components (i.e., tables) and desired results without dictating exactly how those results should be computed. Every SQL implementation sits atop a database engine , whose job it is to interpret SQL statements and determine how the various data structures in the database should be accessed to accurately and efficiently produce the desired outcome.
The SQL language includes two distinct sets of commands: Data Definition Language (DDL) is the subset of SQL used to define and modify various data structures, while Data Manipulation Language (DML) is the subset of SQL used to access and manipulate data contained within the data structures previously defined via DDL. DDL includes numerous commands for handling such tasks as creating tables, indexes, views, and constraints, while DML is comprised of just five statements:
INSERT
Adds data to a database.
UPDATE
Modifies data in a database.
DELETE
Removes data from a database.
MERGE
Adds and/or modifies data in a database. MERGE is part of the 2003 ANSI SQL standard.
SELECT
Retrieves data from a database.
Some people feel that DDL is the sole property of database administrators, while database developers are responsible for writing DML statements, but the two are not so easily separated. It is difficult to efficiently access and manipulate data without an understanding of what data structures are available and how they are related; likewise, it is difficult to design appropriate data structures without knowledge of how the data will be accessed. That being said, this book deals almost exclusively with DML, except where DDL is presented to set the stage for one or more DML examples. The reasons for focusing on just the DML portion of SQL include:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What Is SQL?
SQL is a special-purpose language used to define, access, and manipulate data. SQL is nonprocedural, meaning that it describes the necessary components (i.e., tables) and desired results without dictating exactly how those results should be computed. Every SQL implementation sits atop a database engine , whose job it is to interpret SQL statements and determine how the various data structures in the database should be accessed to accurately and efficiently produce the desired outcome.
The SQL language includes two distinct sets of commands: Data Definition Language (DDL) is the subset of SQL used to define and modify various data structures, while Data Manipulation Language (DML) is the subset of SQL used to access and manipulate data contained within the data structures previously defined via DDL. DDL includes numerous commands for handling such tasks as creating tables, indexes, views, and constraints, while DML is comprised of just five statements:
INSERT
Adds data to a database.
UPDATE
Modifies data in a database.
DELETE
Removes data from a database.
MERGE
Adds and/or modifies data in a database. MERGE is part of the 2003 ANSI SQL standard.
SELECT
Retrieves data from a database.
Some people feel that DDL is the sole property of database administrators, while database developers are responsible for writing DML statements, but the two are not so easily separated. It is difficult to efficiently access and manipulate data without an understanding of what data structures are available and how they are related; likewise, it is difficult to design appropriate data structures without knowledge of how the data will be accessed. That being said, this book deals almost exclusively with DML, except where DDL is presented to set the stage for one or more DML examples. The reasons for focusing on just the DML portion of SQL include:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
A Brief History of SQL
In the early 1970s, an IBM research fellow named Dr. E. F. Codd endeavored to apply the rigors of mathematics to the then-untamed world of data storage and retrieval. Codd's work led to the definition of the relational data model and a language called DSL/Alpha for manipulating data in a relational database. IBM liked what they saw, so they commissioned a project called System/R to build a prototype based on Codd's work. Among other things, the System/R team developed a simplified version of DSL called SQUARE, which was later renamed SEQUEL, and finally renamed SQL.
The work done on System/R eventually led to the release of various IBM products based on the relational model. Other companies, such as Oracle, rallied around the relational flag as well. By the mid 1980s, SQL had gathered sufficient momentum in the marketplace to warrant oversight by the American National Standards Institute (ANSI). ANSI released its first SQL standard in 1986, followed by updates in 1989, 1992, 1999, and 2003. There will undoubtedly be further refinements in the future.
Thirty years after the System/R team began prototyping a relational database, SQL is still going strong. While there have been numerous attempts to dethrone relational databases in the marketplace, well-designed relational databases coupled with well-written SQL statements continue to succeed in handling large, complex data sets where other methods fail.
Given that Oracle was an early adopter of the relational model and SQL, one might think that they would have put a great deal of effort into conforming with the various ANSI standards. For many years, however, the folks at Oracle seemed content that their implementation of SQL was functionally equivalent to the ANSI standards without being overly concerned with true compliance. Beginning with the release of Oracle8i, however, Oracle has stepped up its efforts to conform to ANSI standards and has tackled such features as the CASE statement and the left/right/full outer join syntax.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
A Simple Database
Because this is a practical book, it contains numerous examples. Rather than fabricating different sets of tables and columns for every chapter or section in the book, we have decided to draw from a single, simple schema for most examples. The subject area that we chose to model is a parts distributor, such as an auto-parts wholesaler or medical device distributor, in which the business fills customer orders for one or more parts that are supplied by external suppliers. Figure 1-1 shows the entity-relationship model for this business.
Figure 1-1: The parts distributor model
If you are unfamiliar with entity-relationship models, here is a brief description of how they work. Each box in the model represents an entity, which correlates to a database table. The lines between the entities represent the relationships between tables, which correlate to foreign keys. For example, the cust_order table holds a foreign key to the employee table, which signifies the salesperson responsible for a particular order. Physically, this means that the cust_order table contains a column holding employee ID numbers, and that, for any given order, the employee ID number indicates the employee who sold that order. If you find this confusing, simply use the diagram as an illustration of the tables and columns found within our database. As you work your way through the SQL examples in this book, return occasionally to the diagram, and you should find that the relationships start making sense.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
DML Statements
In this section, we will introduce the five statements that comprise the DML portion of SQL. The information presented in this section should be enough to allow you to start writing DML statements. As is discussed at the end of the section, however, DML can look deceptively simple, so keep in mind while reading the section that there are many more facets to DML than are discussed here.
The SELECT statement is used to retrieve data from a database. The set of data retrieved via a SELECT statement is referred to as a result set. Like a table, a result set is comprised of rows and columns, making it possible to populate a table using the result set of a SELECT statement. The SELECT statement can be summarized as follows:
SELECT <one or more things>
FROM <one or more places>
WHERE <zero, one, or more conditions apply>
While the SELECT and FROM clauses are required, the WHERE clause is optional (although you will seldom see it omitted). We will therefore begin with a simple example that retrieves three columns from every row of the customer table:
               SELECT cust_nbr, name, region_id
               FROM customer;

  CUST_NBR NAME                            REGION_ID
---------- ------------------------------ ----------
         1 Cooper Industries                       5
         2 Emblazon Corp.                          5
         3 Ditech Corp.                            5
         4 Flowtech Inc.                           5
         5 Gentech Industries                      5
         6 Spartan Industries                      6
         7 Wallace Labs                            6
         8 Zantech Inc.                            6
         9 Cardinal Technologies                   6
        10 Flowrite Corp.                          6
        11 Glaven Technologies                     7
        12 Johnson Labs                            7
        13 Kimball Corp.                           7
        14 Madden Industries                       7
        15 Turntech Inc.                           7
        16 Paulson Labs                            8
        17 Evans Supply Corp.                      8
        18 Spalding Medical Inc.                   8
        19 Kendall-Taylor Corp.                    8
        20 Malden Labs                             8
        21 Crimson Medical Inc.                    9
        22 Nichols Industries                      9
        23 Owens-Baxter Corp.                      9
        24 Jackson Medical Inc.                    9
        25 Worcester Technologies                  9
        26 Alpha Technologies                     10
        27 Phillips Labs                          10
        28 Jaztech Corp.                          10
        29 Madden-Taylor Inc.                     10
        30 Wallace Industries                     10
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
So Why Are There 17 More Chapters?
After reading this chapter, you might think that SQL looks pretty simple (at least the DML portion). At a high level, it is fairly simple, and you now know enough about the language to go write some code. However, you will learn over time that there are numerous ways to arrive at the same end point, and some are more efficient and elegant than others. The true test of SQL mastery is when you no longer have the desire to return to what you were working on the previous year, rip out all the SQL, and recode it. For one of us, it took about nine years to reach that point. Hopefully, this book will help you reach that point in far less time.
While you are reading the rest of the book, you might notice that the majority of examples use SELECT statements, with the remainder somewhat evenly distributed across INSERT, UPDATE, and DELETE statements. This disparity is not indicative of the relative importance of SELECT statements over the other three DML statements; rather, SELECT statements are favored because we can show a query's result set, which should help you to better understand the query, and because many of the points being made using SELECT statements can be applied to UPDATE and DELETE statements as well.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 2: The WHERE Clause
Whether you are querying, modifying, or deleting data, the WHERE clause is the mechanism for identifying what data you want to work with. This chapter explores the role of the WHERE clause in SQL statements, as well as the various options available when building a WHERE clause.
Before delving into the WHERE clause, let's imagine life without it. Say that you are interested in doing some maintenance on the data in the part table. To inspect the data in the table, you issue the following query:
SELECT part_nbr, name, supplier_id, status, inventory_qty
FROM part;
If the part table contains 10,000 items, the result set returned by the query would consist of 10,000 rows, each with 5 columns. You would then load the 10,000 rows into memory and make your modifications.
Once you have made the required modifications to your data in memory, it is time to apply the changes to the part table. Without the ability to specify the rows to modify, you have no choice but to delete all rows in the table and re-insert all 10,000 rows:
DELETE FROM part;

INSERT INTO part (part_nbr, name, supplier_id, status, inventory_qty)
VALUES ('XY5-1002', 'Wonder Widget', 1, 'IN-STOCK', 1);

/* 9,999 more INSERTs on the wall, 9,999 more INSERTS . . .  */
While this approach works in theory, it wreaks havoc on performance, concurrency (the ability for more than one user to modify data simultaneously), and scalability (the ability to perform predictably as load increases).
Now imagine that you want to modify data in the part table only for those parts supplied by Acme Industries. Since the supplier's name is stored in the supplier table, you must include both the part and supplier tables in the FROM clause:
SELECT p.part_nbr, p.name, p.supplier_id, p.status, p.inventory_qty,
  s.supplier_id, s.name
FROM part p, supplier s;
If 100 companies supply the 10,000 parts in the part table, this query will return 1,000,000 rows. Known as the Cartesian product
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Life Without WHERE
Before delving into the WHERE clause, let's imagine life without it. Say that you are interested in doing some maintenance on the data in the part table. To inspect the data in the table, you issue the following query:
SELECT part_nbr, name, supplier_id, status, inventory_qty
FROM part;
If the part table contains 10,000 items, the result set returned by the query would consist of 10,000 rows, each with 5 columns. You would then load the 10,000 rows into memory and make your modifications.
Once you have made the required modifications to your data in memory, it is time to apply the changes to the part table. Without the ability to specify the rows to modify, you have no choice but to delete all rows in the table and re-insert all 10,000 rows:
DELETE FROM part;

INSERT INTO part (part_nbr, name, supplier_id, status, inventory_qty)
VALUES ('XY5-1002', 'Wonder Widget', 1, 'IN-STOCK', 1);

/* 9,999 more INSERTs on the wall, 9,999 more INSERTS . . .  */
While this approach works in theory, it wreaks havoc on performance, concurrency (the ability for more than one user to modify data simultaneously), and scalability (the ability to perform predictably as load increases).
Now imagine that you want to modify data in the part table only for those parts supplied by Acme Industries. Since the supplier's name is stored in the supplier table, you must include both the part and supplier tables in the FROM clause:
SELECT p.part_nbr, p.name, p.supplier_id, p.status, p.inventory_qty,
  s.supplier_id, s.name
FROM part p, supplier s;
If 100 companies supply the 10,000 parts in the part table, this query will return 1,000,000 rows. Known as the Cartesian product, this number equates to every possible combination of all rows from the two tables. As you sift through the million rows, you would keep only those where the values of p.supplier_id and s.supplier_id are identical and where the s.name column matches 'Acme Industries'. If Acme Industries supplies only 50 of the 10,000 parts in your database, you will end up discarding 999,950 of the 1,000,000 rows returned by your query.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
WHERE to the Rescue
Hopefully, the scenarios in the previous section give you some insight into the utility of the WHERE clause, including the ability to:
  • Filter out unwanted data from a query's result set.
  • Isolate one or more rows of a table for modification.
  • Conditionally join two or more data sets together.
To see how these things are accomplished, let's add a WHERE clause to the previous SELECT statement, which strives to locate all parts supplied by Acme Industries. Here's the query with the new WHERE clause:
SELECT p.part_nbr, p.name, p.supplier_id, p.status, p.inventory_qty,
  s.supplier_id, s.name
FROM part p, supplier s
WHERE s.supplier_id = p.supplier_id
  AND s.name = 'Acme Industries';
The WHERE clause here is comprised of two parts, known as conditions, which are evaluated separately. Conditions always evaluate to either TRUE or FALSE; if there are multiple conditions in a WHERE clause, they all must evaluate to TRUE for a given row to be included in the result set. Actually, that's a bit of an oversimplification. As you will see later, using the OR and NOT operators allows the WHERE clause to evaluate to TRUE even if individual conditions evaluate to FALSE.
For this example, a row created by combining data from the part and supplier tables will only be included in the final result set if both tables share a common value for the supplier_id column, and if the value of the name column in the supplier table matches 'Acme Industries'. Any other permutation of data from the two tables would evaluate to FALSE and be discarded.
For this chapter only, we'll use the older style of join syntax in which you specify join conditions in the WHERE clause. We do this to explore the full functionality of the WHERE clause.
With the addition of the WHERE clause to the previous example, therefore, Oracle will take on the work of discarding undesired rows from the result set, and only 50 rows would be returned by the query, rather than 1,000,000. Now that you have retrieved the 50 rows of interest from the database, you can begin the process of modifying the data. Keep in mind, however, that with the WHERE clause at your disposal you will no longer need to delete and re-insert your modified data; instead, you can use the UPDATE statement to modify specific rows based on the
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
WHERE Clause Evaluation
Now that you have seen the WHERE clause in action, let's take a look at how it is evaluated. As previously mentioned, the WHERE clause consists of one or more conditions that evaluate independently to TRUE or FALSE. If your WHERE clause consists of multiple conditions, the conditions are separated by the logical operators AND and OR. Depending on the outcome of the individual conditions and the placement of these logical operators, Oracle will assign a final value of TRUE or FALSE to each candidate row, thereby determining whether a row will be included in the final result set.
Here's another look at the Acme Industries query:
SELECT p.part_nbr, p.name, p.supplier_id, p.status, p.inventory_qty,
  s.supplier_id, s.name
FROM part p, supplier s
WHERE s.supplier_id = p.supplier_id
  AND s.name = 'Acme Industries';
The WHERE clause consists of two conditions separated by AND. Thus, a row will only be included if both conditions evaluate to TRUE. Table 2-1 shows the possible scenarios when conditions are replaced by their possible outcomes.
Table 2-1: Multiple-condition evaluation using AND
Intermediate result
Final result
WHERE TRUE AND TRUE
TRUE
WHERE FALSE AND FALSE
FALSE
WHERE FALSE AND TRUE
FALSE
WHERE TRUE AND FALSE
FALSE
Using basic logic rules, you can see that the only combination of outcomes that results in a final value of TRUE being assigned to a candidate row is where both conditions evaluate to TRUE. Table 2-2 demonstrates the possible outcomes if the conditions had been separated by OR rather than AND.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Conditions and Expressions
Now that you understand how conditions are grouped together and evaluated, it's time to take a look at the different elements that make up a condition. A condition is comprised of one or more expressions along with one or more operators . Examples of expressions include:
  • Numbers
  • Columns, such as s.supplier_id
  • Literals, such as `Acme Industries'
  • Functions, such as UPPER(`abcd')
  • Lists of simple expressions, such as (1, 2, 3)
  • Subqueries
Examples of operators include:
  • Arithmetic operators, such as +, -, *, and /
  • Comparison operators, such as =, <, >=, !=, LIKE, and IN
The following sections explore many of the common condition types that use different combinations of the preceeding expression and operator types.
Most of the conditions found in a WHERE clause will be equality conditions used to join data sets together or to isolate specific values. You have already encountered these types of conditions numerous times in previous examples, including:
s.supplier_id = p.supplier_id

s.name = 'Acme Industries'

supplier_id = (SELECT supplier_id 
  FROM supplier 
  WHERE name = 'Acme Industries')
All three conditions are comprised of a column expression followed by a comparison operator (=) followed by another expression. The conditions differ in the type of expression on the right side of the comparison operator. The first example compares one column to another, the second example compares a column to a literal, and the third example compares a column to the value returned by a subquery.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
WHERE to Go from Here
This chapter has introduced the role of the WHERE clause in different types of SQL statements as well as the various components used to build a WHERE clause. Because the WHERE clause plays such an important role in many SQL statements, however, the topic is far from exhausted. Additional coverage of WHERE clause topics may be found in:
  • Chapter 3, in which various flavors of join conditions are studied in detail
  • Chapter 5, which probes the different types of subqueries along with the appropriate operators for evaluating their results
  • Chapter 6, in which various methods of handling date/time data are explored
  • Chapter 15, which explores certainaspects of the WHERE clause from the standpoint of performance andefficiency
Additionally, here are a few tips to help you make the most of your WHERE clauses:
  • Check your join conditions carefully. Make sure that each data set in the FROM clause is properly joined. Keep in mind that some joins require multiple conditions. See Chapter 3 for more information.
  • Avoid unnecessary joins. Just because two data sets in your FROM clause contain the same column does not necessitate a join condition be added to your FROM/WHERE clause. In some designs, redundant data has been propagated to multiple tables through a process called denormalization. Take the time to understand the database design, and ask your DBA or database designer for a current data model.
  • Use parentheses. Oracle maintains both operator precedence and condition precedence, meaning there are clearly defined rules for the order in which things will be evaluated, but the safest route for you and for those who will later maintain your code is to dictate evaluation order using parentheses. For operators, specifying
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 3: Joins
Most things in life are not self-contained. There is not one shop where you will find all your requirements. This is valid for database tables as well. Quite often, you need information from more than one table. The SQL construct that combines data from two or more tables is called a join. This chapter takes you into the details of joins, their types, and their usage.
A join query extracts information from two or more tables or views. A join query differs from a regular query in at least the following two ways:
  • The FROM clause of a join query refers to two or more tables or views.
  • A condition is specified in the join query (known as join condition) that relates the rows of one table to the rows of another table.
The following example illustrates a simple join query:
            SELECT department.location_id, department.name, location.regional_group
            FROM department JOIN location
            ON department.location_id = location.location_id;

LOCATION_ID NAME                 REGIONAL_GROUP
----------- -------------------- ---------------
        122 ACCOUNTING           NEW YORK
        124 RESEARCH             DALLAS
        167 OPERATIONS           BOSTON
This example queries two tables. The department name is stored in the department table, whereas each department's region is stored in the location table. Notice the JOIN keyword between the two tables names in the FROM clause. The SELECT list may include columns from any of the tables specified in the FROM clause. The clause starting with the keyword ON specifies the join condition.
The syntax shown in the preceding example is the standard SQL join syntax supported from Oracle9i onwards. The Appendix A describes an older syntax that you should avoid using, but will often encounter in older code.
Usually, when you write a join query, you specify a condition that conveys a relationship between the tables specified in the FROM clause. This condition is referred to as the join condition. The join condition specifies how the rows from one table will be combined with the rows of another table. This join condition is usually applied to the foreign key columns of one table and the primary or unique key columns of another table. In the previous example, the ON clause specifies the join condition by which the
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What Is a Join Query?
A join query extracts information from two or more tables or views. A join query differs from a regular query in at least the following two ways:
  • The FROM clause of a join query refers to two or more tables or views.
  • A condition is specified in the join query (known as join condition) that relates the rows of one table to the rows of another table.
The following example illustrates a simple join query:
            SELECT department.location_id, department.name, location.regional_group
            FROM department JOIN location
            ON department.location_id = location.location_id;

LOCATION_ID NAME                 REGIONAL_GROUP
----------- -------------------- ---------------
        122 ACCOUNTING           NEW YORK
        124 RESEARCH             DALLAS
        167 OPERATIONS           BOSTON
This example queries two tables. The department name is stored in the department table, whereas each department's region is stored in the location table. Notice the JOIN keyword between the two tables names in the FROM clause. The SELECT list may include columns from any of the tables specified in the FROM clause. The clause starting with the keyword ON specifies the join condition.
The syntax shown in the preceding example is the standard SQL join syntax supported from Oracle9i onwards. The Appendix A describes an older syntax that you should avoid using, but will often encounter in older code.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Join Conditions
Usually, when you write a join query, you specify a condition that conveys a relationship between the tables specified in the FROM clause. This condition is referred to as the join condition. The join condition specifies how the rows from one table will be combined with the rows of another table. This join condition is usually applied to the foreign key columns of one table and the primary or unique key columns of another table. In the previous example, the ON clause specifies the join condition by which the location_id column of the department table is equated with the location_id column of the location table:
            ON department.location_id = location.location_id;
         
To perform the join, Oracle picks up one combination of rows from the two tables, and checks to see whether the join condition is true. If the join condition is true, Oracle includes this combination of rows in the result set. This process is repeated for all combinations of rows from the two tables. Some of the things that you should know about join conditions are discussed in the following list:
  • The columns specified in a join condition need not be specified in the SELECT list. In the following example, the join condition involves the location_id column from the department and location tables; however, the location_id column is not selected:
    SELECT d.dept_id, d.name, l.regional_group
    FROM department d JOIN location l
    ON d.location_id = l.location_id;
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Types of Joins
There are several types of joins to be aware of:
Cross joins
Cross joins are joins without a join condition. Each row of one table is combined with each row of another table. The result is referred to as a Cartesian product.
Inner joins
Inner joins are the regular joins. An inner join returns the rows that satisfy the join condition. Each row returned by an inner join contains data from all the tables involved in the join.
Outer joins
Outer joins are an extension to inner joins. An outer join returns the rows that satisfy the join condition and also the rows from one table for which no corresponding rows (i.e., that satisfy the join condition) exist in the other table.
Equi- and non-equi-joins
An equi-join is a join where the join condition uses the equal to (=) operator to relate the rows of two tables. When a join condition uses any other operator to relate the rows of two tables, the join is called a non-equi-join.
Self joins
A self join is a join of a table to itself.
Partition outer joins
A new type of join introduced in Oracle Database 10g that is slated to be part of the next ANSI/ISO SQL standard after SQL:2003. A partition outer join divides your result set into groups, or partitions, and repeats the same outer join for each of these groups. Such joins are extremely handy for generating missing rows.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Joins and Subqueries
Joins can sometimes be used to good advantage in reformulating SELECT statements that would otherwise contain subqueries. Consider the problem of obtaining a list of suppliers of parts for which your inventory has dropped below 10 units. You might begin by writing a query such as the following:
SELECT supplier_id, name
FROM supplier s
WHERE EXISTS (SELECT *
              FROM part p
              WHERE p.inventory_qty < 10
                AND p.supplier_id = s.supplier_id);
The subquery in this SELECT statement is a correlated subquery, which means that it will be executed once for each row in the supplier table. Assuming that you have no indexes on the inventory_qty and supplier_id columns of the part table, this query could result in multiple, full-table scans of the part table. It's possible to restate the query using a join. For example:
SELECT s.supplier_id, s.name
FROM supplier s JOIN part p
ON p.supplier_id = s.supplier_id
WHERE p.inventory_qty < 10;
Whether the join version or the subquery version of a query is more efficient depends on the specific situation. It may be worth your while to test both approaches to see which query runs faster.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
DML Statements on a Join View
A join view is a view based on a join. Special considerations apply when you issue a DML (INSERT, UPDATE, or DELETE) statement against a join view. Ever thought about what happens when you insert a row into a join view—which table does the row go into? And what happens when you delete a row from a join view—from which table is it deleted? This section deals with these questions.
To be modifiable (also referred to as updatable), a join view must not contain any of the following:
  • Hierarchical query clauses, such as START WITH or CONNECT BY
  • GROUP BY or ORDER BY clauses
  • MODEL query
  • Set operations, such as UNION, UNION ALL, INTERSECT, MINUS
  • Aggregate functions, such as AVG, COUNT, MAX, MIN, SUM, and so on
  • Analytical functions, such as CUME_DIST, and so on
  • A subquery or a collection expression in the SELECT list
  • The DISTINCT operator
  • WITH READ ONLY option
  • The ROWNUM pseudocolumn
A DML statement on a join view can modify only one base table of the view. Thus, to be modifiable, a join view must also preserve a key from at least one of its tables.
A key-preserved table is the most important requirement for a join view to be modifiable. In a join, a table is called a key-preserved table if its keys are preserved through the join—every key of the table can also be a key of the resultant join result set. Every primary key or unique key value in the base table must also be unique in the result set of the join. Here's an example that better demonstrates the concept of key preserved tables:
               DESC employee
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 4: Group Operations
Group operations are quite common in the day-to-day life of a SQL programmer. When you use SQL to access a database, it is quite common to expect questions such as:
  • What is the maximum salary in this department?
  • How many managers are there in each department?
  • What is the number of customers for each product?
  • Can you print the monthly aggregate sales for each region?
You need group operations to answer these questions. Oracle provides a rich set of features to handle group operations. These features include aggregate functions, the GROUP BY clause, the HAVING clause, and the extensions to the GROUP BY clause—ROLLUP, CUBE, and GROUPING SETS.
This chapter deals with simple group operations involving the aggregate functions, the GROUP BY and HAVING clauses. Advanced group operations such as ROLLUP, CUBE, and GROUPING SETS are discussed in Chapter 13.
An aggregate function summarizes the results of an expression over a number of rows, returning a single value. The general syntax for most of the aggregate functions is as follows:
            aggregate_function([DISTINCT | ALL] expression)
The syntax elements are:
aggregate_function
Gives the name of the function—e.g., SUM, COUNT, AVG, MAX, MIN
DISTINCT
Specifies that the aggregate function should consider only distinct values of the argument expression.
ALL
Specifies that the aggregate function should consider all values, including all duplicate values, of the argument expression. The default is ALL.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Aggregate Functions
An aggregate function summarizes the results of an expression over a number of rows, returning a single value. The general syntax for most of the aggregate functions is as follows:
            aggregate_function([DISTINCT | ALL] expression)
The syntax elements are:
aggregate_function
Gives the name of the function—e.g., SUM, COUNT, AVG, MAX, MIN
DISTINCT
Specifies that the aggregate function should consider only distinct values of the argument expression.
ALL
Specifies that the aggregate function should consider all values, including all duplicate values, of the argument expression. The default is ALL.
expression
Specifies a column, or any other expression, on which you want to perform the aggregation.
Let's look at a simple example. The following SQL uses the MAX function to find the maximum salary of all employees:
            SELECT MAX(salary) FROM employee;

MAX(SALARY)
-----------
       5000
In subsequent sections, we use a series of slightly more involved examples that illustrate various aspects of aggregate function behavior. For those examples, we use the following cust_order table:
            DESC cust_order

 Name                             Null?    Type
 -------------------------------- -------- --------------
 ORDER_NBR                        NOT NULL NUMBER(7)
 CUST_NBR                         NOT NULL NUMBER(5)
 SALES_EMP_ID                     NOT NULL NUMBER(5)
 SALE_PRICE                                NUMBER(9,2)
 ORDER_DT                         NOT NULL DATE
 EXPECTED_SHIP_DT                 NOT NULL DATE
 CANCELLED_DT                              DATE
 SHIP_DT                                   DATE
 STATUS                                    VARCHAR2(20)

Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The GROUP BY Clause
The GROUP BY clause, along with the aggregate functions, groups a result set into multiple groups, and then produces a single row of summary information for each group. For example, if you want to find the total number of orders for each customer, execute the following query:
            SELECT cust_nbr, COUNT(order_nbr) 
            FROM cust_order 
            GROUP BY cust_nbr;

  CUST_NBR COUNT(ORDER_NBR)
---------- ----------------
        1                8
        4                4
        5                6
        8                2
This query produces one summary line of output for each customer. This is the essence of a GROUP BY query. You asked Oracle to GROUP the results BY cust_nbr; therefore, it produced one output row for each distinct value of cust_nbr. Each data value for a given customer represents a summary based on all rows for that customer.
When you write a query with a GROUP BY clause, there are a number of rules you need to be aware of that govern the correspondence between the columns in the SELECT and GROUP BY clauses. Generally speaking, any nonaggregate expression in your SELECT clause must also be reflected in your GROUP BY clause.

Section 4.2.1.1: Aggregate expressions generally require a GROUP BY clause

The nonaggregate expression cust_nbr in the SELECT list of the query in our most recent example also appears in the GROUP BY clause. If you have a mix of aggregate and nonaggregate expressions in the SELECT list, SQL expects that you are trying to perform a GROUP BY operation, and you must also include a GROUP BY clause in your query. Oracle returns an error if you fail to do so. For example, if you omit the GROUP BY clause, the following error is returned:
                  SELECT cust_nbr, sales_emp_id, COUNT(order_nbr)
                  FROM cust_order;

SELECT cust_nbr, sales_emp_id, COUNT(order_nbr)
       *
ERROR at line 1:
ORA-00937: not a single-group group function
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The HAVING Clause
The HAVING clause is closely associated with the GROUP BY clause. The HAVING clause is used to put a filter on the groups created by the GROUP BY clause. If a query has a HAVING clause along with a GROUP BY clause, the result set will include only the groups that satisfy the condition specified in the HAVING clause. Let's look at some examples that illustrate this. The following query returns the number of orders per customer:
            SELECT cust_nbr, COUNT(order_nbr) 
            FROM cust_order 
            GROUP BY cust_nbr
            HAVING cust_nbr < 6;

  CUST_NBR COUNT(ORDER_NBR)
---------- ----------------
         1                8
         4                4
         5                6
Notice that the output only includes customers with numbers below 6. That's because the HAVING clause specified cust_nbr < 6 as a condition. Orders for all customers were counted, but only those groups that matched the specified HAVING condition were returned as the result.
The previous example is a poor use of the HAVING clause, because that clause references only unsummarized data. It's more efficient to use WHERE cust_nbr < 6 instead of HAVING cust_nbr < 6, because the WHERE clause eliminates rows prior to summarization, whereas HAVING eliminates groups post-summarization. A better version of the previous query would be:
SELECT cust_nbr, COUNT(order_nbr) 
FROM cust_order 
WHERE cust_nbr < 6
GROUP BY cust_nbr;
The next example shows a more appropriate use of the HAVING clause:
            SELECT cust_nbr, COUNT(order_nbr) 
            FROM cust_order 
            GROUP BY cust_nbr
            HAVING COUNT(order_nbr) > 2;

  CUST_NBR COUNT(ORDER_NBR)
---------- ----------------
         1                8
         4                4
         5                6
See the use of the aggregate function COUNT in the HAVING clause? This is an appropriate use for HAVING, because the results of the aggregate function cannot be determined until after the grouping takes place.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Nested Group Operations
The examples discussed in this chapter so far all involved one group operation on a column of a table. SQL also allows you to nest group functions, which means that one group function can enclose an expression that is itself a group operation on another expression or column. Let's consider the following example:
An economic slowdown has resulted in budget constraints for many employers, especially in the IT industry. Budget constraints have forced companies to take a second look at the money spent on employee compensation. Some companies have had to downsize their workforce, others have had to cut down employee bonuses, while still others have cut the employee base salaries. Your company is no exception, and is also under financial pressure. Your CEO must take a serious look at the compensation structure of the employees at various levels in various departments in the company.
Your CEO calls on you to query the Human Resources database and help him collect data. By this time, you've already heard the rumors of upcoming "reductions in force" (RIFs) and compensation cuts. This is your golden opportunity to impress your CEO with your skills, to make sure you are not affected by the RIFs. Mess up now, and you can be pretty sure that you have to start looking for another job in this increasingly competitive job market.
Here's your CEO's first question: What is the maximum amount of money spent by any department on employee salaries?
To answer this question, you know that you need to compute the sum of the salaries of all the employees in each department, and then find the maximum of those individual sums. Now that you know about the GROUP BY clause, finding the sum of salaries for all the employees in each department is easy:
            SELECT dept_id, SUM(salary) 
            FROM employee 
            GROUP BY dept_id;

   DEPT_ID SUM(SALARY)
---------- -----------
        10        8750
        20        9900
        30        9400
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 5: Subqueries
Some endeavors require a certain level of preparation before the main activity can commence. Cooking, for example, often involves pre-mixing sets of ingredients before they are combined. Similarly, certain types of SQL statements benefit from the creation of intermediate result sets to aid in statement execution. The structure responsible for generating intermediate result sets is the subquery. This chapter will define and illustrate the use of subqueries in SQL statements.
As we mentioned in Chapter 1, a subquery is a SELECT statement that is nested within another SQL statement. For the purpose of this discussion, we will call the SQL statement that contains a subquery the containing statement. Subqueries are executed prior to execution of their containing SQL statement (see Section 5.3 later in this chapter for the exception to this rule), and the result set generated by a subquery is discarded after its containing SQL statement has finished execution. Thus, a subquery can be thought of as a tempora