BUY THIS BOOK
Add to Cart

Print Book $34.95


Add to Cart

PDF $27.99

Safari Books Online

What is this?

Add to UK Cart

Print Book £24.95

What is this?

Looking to Reprint or License this content?


Learning SQL
Learning SQL

By Alan Beaulieu
Book Price: $34.95 USD
£24.95 GBP
PDF Price: $27.99

Cover | Table of Contents


Table of Contents

Chapter 1: A Little Background
Before we roll up our sleeves and get to work, it might be beneficial to introduce some basic database concepts and look at the history of computerized data storage and retrieval.
A database is nothing more than a set of related information. A telephone book, for example, is a database of the names, phone numbers, and addresses of all people living in a particular region. While a telephone book is certainly a ubiquitous and frequently used database, it suffers from the following:
  • Finding a person's telephone number can be time consuming, especially if the telephone book contains a large number of entries.
  • A telephone book is only indexed by last/first names, so finding the names of the people living at a particular address, while possible in theory, is not a practical use for this database.
  • From the moment the telephone book is printed, the information becomes less and less accurate as people move into or out of a region, change their telephone numbers, or move to another location within the same region.
The same drawbacks attributed to telephone books can also apply to any manual data storage system, such as patient records stored in a filing cabinet. Because of the cumbersome nature of paper databases, some of the first computer applications developed were database systems, which are computerized data storage and retrieval mechanisms. Because a database system stores data electronically rather than on paper, a database system is able to retrieve data more quickly, index data in multiple ways, and deliver up-to-the-minute information to its user community.
Early database systems managed data stored on magnetic tapes. Because there were generally far more tapes than tape readers, technicians were tasked with loading and unloading tapes as specific data was required. Because the computers of that era had very little memory, multiple requests for the same data generally required the data to be read from the tape multiple times. While these database systems were a significant improvement over paper databases, they are a far cry from what is possible with today's technology. (Modern database systems can manage terabytes of data spread across many fast-access disk drives, holding tens of gigabytes of that data in high-speed memory, but I'm getting a bit ahead of myself.)
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Introduction to Databases
A database is nothing more than a set of related information. A telephone book, for example, is a database of the names, phone numbers, and addresses of all people living in a particular region. While a telephone book is certainly a ubiquitous and frequently used database, it suffers from the following:
  • Finding a person's telephone number can be time consuming, especially if the telephone book contains a large number of entries.
  • A telephone book is only indexed by last/first names, so finding the names of the people living at a particular address, while possible in theory, is not a practical use for this database.
  • From the moment the telephone book is printed, the information becomes less and less accurate as people move into or out of a region, change their telephone numbers, or move to another location within the same region.
The same drawbacks attributed to telephone books can also apply to any manual data storage system, such as patient records stored in a filing cabinet. Because of the cumbersome nature of paper databases, some of the first computer applications developed were database systems, which are computerized data storage and retrieval mechanisms. Because a database system stores data electronically rather than on paper, a database system is able to retrieve data more quickly, index data in multiple ways, and deliver up-to-the-minute information to its user community.
Early database systems managed data stored on magnetic tapes. Because there were generally far more tapes than tape readers, technicians were tasked with loading and unloading tapes as specific data was required. Because the computers of that era had very little memory, multiple requests for the same data generally required the data to be read from the tape multiple times. While these database systems were a significant improvement over paper databases, they are a far cry from what is possible with today's technology. (Modern database systems can manage terabytes of data spread across many fast-access disk drives, holding tens of gigabytes of that data in high-speed memory, but I'm getting a bit ahead of myself.)
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What Is SQL?
Along with Codd's definition of the relational model, he proposed a language called DSL/Alpha for manipulating the data in relational tables. Shortly after Codd's paper was released, IBM commissioned a group to build a prototype based on Codd's ideas. This group created a simplified version of DSL/Alpha that they called SQUARE. Refinements to SQUARE led to a language called SEQUEL, which was, finally, renamed SQL.
SQL is now entering its fourth decade, and it has undergone a great deal of change along the way. In the mid 1980s, the American National Standards Institute (ANSI) began working on the first standard for the SQL language, which was published in 1986. Subsequent refinements led to new releases of the SQL standard in 1989, 1992, 1999, and 2003. Along with refinements to the core language, new features have been added to the SQL language to incorporate object-oriented functionality, among other things.
SQL goes hand-in-hand with the relational model because the result of an SQL query is a table (also called, in this context, a result set). Thus, a new permanent table can be created in a relational database simply by storing the result set of a query. Similarly, a query can use both permanent tables and the result sets from other queries as inputs (this will be explored in detail in Chapter 9).
One final note: SQL is not an acronym for anything (although many people will insist it stands for "Structured Query Language"). When referring to the language, it is equally acceptable to say the letters individually (i.e., S. Q. L.) or to use the word "sequel."
The SQL language is broken into several distinct parts: the parts that will be explored in this book include SQL schema statements, which are used to define the data structures stored in the database; SQL data statements, which are used to manipulate the data structures previously defined using SQL schema statements; and SQL transaction statements, which are used to begin, end, and rollback transactions (covered in Chapter 12). For example, to create a new table in your database, you would use the SQL schema statement
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What Is MySQL?
Relational databases have been available commercially for over two decades. Some of the most mature and popular products include:
  • Oracle Database from Oracle Corporation
  • SQL Server from Microsoft
  • DB2 Universal Database from IBM
  • Sybase Adaptive Server from Sybase
  • Informix Dynamic Server from IBM
All of these database servers do approximately the same thing, although some are better equipped to run very large or very-high-throughput databases. Others are better at handling objects or very large files or XML documents, etc. Additionally, all of these servers do a pretty good job of being compliant with the latest ANSI SQL standard. This is a good thing, and I will make it a point to show you how to write SQL statements that will run on any of these platforms with little or no modification.
Along with the commercial database servers, there has been quite a bit of activity in the open-source community in the past five years with the goal of creating a viable alternative to the commercial database servers. Two of the most commonly used open-source database servers are PostgreSQL and MySQL. The MySQL web site (http://www.mysql.com) currently claims over 6 million installations, their server is available for free, and I have found it extremely simple to download and install their server. For these reasons, I have decided that all examples for this book will be run against a MySQL (Version 4.1.11) database, and that the mysql command-line tool will be used to format query results. Even if you are already using another server and never plan to use MySQL, I urge you to install the latest MySQL server, load the sample schema and data, and experiment with the data and examples in this book.
However, keep in mind the following caveat:
This is not a book about MySQL's SQL implementation.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What's in Store
The overall goal of the next four chapters is to introduce the SQL data statements, with a special emphasis of the three main clauses of the select statement. Additionally, you will see many examples that use the bank schema (introduced in the next chapter), which will be used for all examples in the book. It is my hope that your growing familiarity with a single database will allow you to get to the crux of an example without your having to stop and examine the tables being used each time.
After you have a solid grasp on the basics, the remaining chapters will drill deep on additional concepts, most of which are independent of each other. Thus, if you find yourself getting confused, you can always move ahead and come back later to revisit a chapter. When you have finished the book and worked through all of the examples, you will be well on your way to becoming a seasoned SQL practitioner.
For those readers interested in learning more about relational databases, the history of computerized database systems, or the SQL language than was covered in this short introduction, here are a few resources worth checking out:
  • Database in Depth: Relational Theory for Practitioners by C.J. Date (O'Reilly)
  • An Introduction to Database Systems, Eighth Edition by C.J. Date (Addison Wesley)
  • The Database Relational Model: A Retrospective Review and Analysis: A Historical Account and Assessment of E. F. Codd's Contribution to the Field of Database Technology by C.J. Date (Addison Wesley)
  • http://en.wikipedia.org/wiki/Database_management_system
  • http://www.mcjones.org/System_R/
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 2: Creating and Populating a Database
This chapter will provide you with the information you need to create your first database and to create the tables and associated data used for the examples in this book. You will also learn about various data types and see how to create tables using them. Because the examples in this book are executed against a MySQL database, this chapter is somewhat skewed toward MySQL's features and syntax, but most concepts are applicable to any server.
If you already have a MySQL database server available for your use, you can start with item number 8 in the instructions below. Keep in mind, however, that this book assumes that you are using MySQL Version 4.1.11 or later, so you may want to consider upgrading your server or installing another server if you are using an earlier release.
The following instructions will show you the minimum steps required to install a MySQL server on a Windows computer, create a database, and load the sample data for this book:
  1. Download the MySQL Database Server (Version 4.1.7 or later) from http://dev.mysql.com. Unless you are planning to use the server for more than just a training tool, you should download the Essentials Package, which includes only the commonly used tools, instead of the Complete Package.
  2. Launch the installation by double-clicking on the downloaded file.
  3. Install the server using the "typical install." The installation should be quick and painless, but feel free to consult the online installation guide at http://dev.mysql.com/doc/mysql/en/Installing.html.
  4. When the installation is complete, make sure the checkbox is checked next to "Configure the MySQL Server now" before pressing the finish button. This will launch the Configuration Wizard.
  5. When the Configuration Wizard launches, choose the Standard Configuration radio button, and then check both the Install as Windows Service and Include Bin Directory in Windows Path checkboxes.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Creating a MySQL Database
If you already have a MySQL database server available for your use, you can start with item number 8 in the instructions below. Keep in mind, however, that this book assumes that you are using MySQL Version 4.1.11 or later, so you may want to consider upgrading your server or installing another server if you are using an earlier release.
The following instructions will show you the minimum steps required to install a MySQL server on a Windows computer, create a database, and load the sample data for this book:
  1. Download the MySQL Database Server (Version 4.1.7 or later) from http://dev.mysql.com. Unless you are planning to use the server for more than just a training tool, you should download the Essentials Package, which includes only the commonly used tools, instead of the Complete Package.
  2. Launch the installation by double-clicking on the downloaded file.
  3. Install the server using the "typical install." The installation should be quick and painless, but feel free to consult the online installation guide at http://dev.mysql.com/doc/mysql/en/Installing.html.
  4. When the installation is complete, make sure the checkbox is checked next to "Configure the MySQL Server now" before pressing the finish button. This will launch the Configuration Wizard.
  5. When the Configuration Wizard launches, choose the Standard Configuration radio button, and then check both the Install as Windows Service and Include Bin Directory in Windows Path checkboxes.
  6. During the configuration, you will be asked to choose a password for the root user. Make sure you write down the password for future use.
  7. Open a shell (use Start Run Command) and log in as the root user: mysql -u root -p. You will be asked for the root password, and then the
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Using the mysql Command-Line Tool
Whenever you invoke the mysql command-line tool, you can specify the username and database to use, as in the following:
    mysql -u lrngsql -p bank
You will be asked for your password, and then the mysql> prompt will appear, via which you will be able to issue SQL statements and view the results. For example, if you want to know the current date and time, you could issue the following query:
    mysql> SELECT now();
    +---------------------+
    | now()              |
    +---------------------+
    | 2005-05-06 16:48:46 |
    +---------------------+
    1 row in set (0.01 sec)
The now() function is a built-in MySQL function that returns the current date and time. As you can see, the mysql command-line tool formats the results of your queries within a rectangle bounded by +, -, and | characters . After the results have been exhausted (in this case, there is only a single row of results), the mysql command-line tool shows how many rows were returned and how long the SQL statement took to execute.
When you are done with the
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
MySQL Data Types
In general, all of the popular database servers have the capacity to store the same types of data, such as strings, dates, and numbers. Where they typically differ is in the specialty data types, such as XML documents or very large text or binary documents. Since this is an introductory book on SQL, and since 98% of the columns you encounter will be simple data types, this book will concern itself only with the character, date, and numeric data types.
Character data can either be stored as fixed-length or variable-length strings, the difference being that fixed-length strings are right-padded with spaces, whereas variable-length strings are not. When defining a character column, you must specify the maximum size of any string to be stored in the column. For example, if you want to store strings up to 20 characters in length, you could use either of the following definitions:
    CHAR(20)    /* fixed-length */
    VARCHAR(20) /* variable-length */
The maximum length for these data types is currently 255 characters (although upcoming releases will allow for longer strings). If you need to store longer strings (such as emails, XML documents, etc.), then you will want to use one of the text types (tinytext, text, mediumtext, longtext), which are covered later in this section. In general, you should use the char type when all strings to be stored in the column are of the same length, such as state abbreviations, and the varchar type when strings to be stored in the column are of varying lengths. Both char and varchar are used in a similar fashion in all of the major database servers.
Oracle Database is an exception when it comes to the use of varchar. Oracle users should use the varchar2 type when defining variable-length character columns.

Section 2.3.1.1: Character sets

For languages that use the Latin alphabet, such as English, there is a sufficiently small number of characters such that only a single byte is needed to store each character. Other languages, such as Japanese and Korean, contain large numbers of characters, thus requiring multiple bytes of storage for each character. Such character sets are therefore called
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Table Creation
Now that you have a firm grasp on what data types may be stored in a MySQL database, it's time to see how to use these types in table definitions. Let's start by defining a table to hold information about a person.
A good way to start designing a table is to do a bit of brainstorming to see what kind of information would be helpful to include. Here's what I came up with after thinking for a short time about the types of information that describe a person:
  • Name
  • Gender
  • Birth date
  • Address
  • Favorite foods
This is certainly not an exhaustive list, but it's good enough for now. The next step is to assign column names and data types. Table 2-6 shows a first pass at these.
Table 2-6: Person table, first pass
Column
Type
Allowable values
Name
Varchar(40)
Gender
Char(1)
M, F
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Populating and Modifying Tables
With the person and favorite_food tables in place, you can now begin to explore the four SQL data statements: insert, update, delete, and select.
Since there is not yet data in the person or favorite_food tables, the first of the four SQL data statements to be explored will be the insert statement. There are three main components to an insert statement:
  • The name of the table into which to add the data
  • The names of the columns in the table to be populated
  • The values with which to populate the columns
Therefore, you are not required to provide data for every column in the table (unless all the columns in the table have been defined as not null). In some cases, those columns that are not included in the initial insert statement will be given a value later via an update statement. In other cases, a column may never receive a value for a particular row of data (such as a customer order that is cancelled before being shipped, thus rendering the ship_date column inapplicable).

Section 2.5.1.1: Generating numeric key data

Before inserting data into the person table, it would be useful to discuss how values are generated for numeric primary keys. Other than picking a number out of thin air, you have a couple of options:
  • Look at the largest value currently in the table and add 1.
  • Let the database server provide the value for you.
While the first option may seem valid, it proves problematic in a multiuser environment, since two users might look at the table at the same time and generate the same value for the primary key. Instead, all database servers on the market today provide a safer, more robust method for generating numeric keys. In some cases, such as the Oracle Database, a separate schema object is used (called a
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
When Good Statements Go Bad
So far, all of the SQL data statements shown in this chapter have been well-formed and have played by the rules. Based on the table definitions for the person and favorite_food tables, however, there are lots of ways that you can run afoul when inserting or modifying data. This section shows you some of the common mistakes that you might come across and how the MySQL server will respond.
Because the table definitions include the creation of primary-key constraints, MySQL will make sure that duplicate values are not inserted into the tables. The next statement attempts to bypass the auto-increment feature of the person_id column and create another row in the person table with a person_id of 1:
    mysql> INSERT INTO person
        ->  (person_id, fname, lname, gender, birth_date)
        -> VALUES (1, 'Charles','Fulton', 'M', '1968-01-15');
    ERROR 1062 (23000): Duplicate entry '1' for key 1
There is nothing stopping you (with the current schema objects, at least) from creating two rows with identical names, addresses, birth dates, etc., as long as they have different values for the person_id column.
The table definition for the favorite_food table includes the creation of a foreign-key constraint on the person_id column. This constraint ensures that all values of person_id entered into the favorite_food table already exist in the person table. Here's what would happen if you tried to create a row that violates this constraint:
    mysql> INSERT INTO favorite_food (person_id, food)
        -> VALUES (999, 'lasagna');
               
    ERROR 1216 (23000): Cannot add or update a child row:
      a foreign key constraint fails
            
In this case, the favorite_food table is considered the child, and the person table is considered the parent, since the favorite_food table is dependent on the person table for its data. If you plan to enter data into both tables, you will need to create a row in
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Bank Schema
For the remainder of the book, you will be using a group of tables that model a community bank. Some of the tables include Employee, Branch, Account, Customer, Product, Transaction, and Loan. The entire schema and example data should have been created when you followed the 13 steps at the beginning of the chapter for loading the MySQL server and generating the sample data. To see a diagram of the tables and their columns and relationships, see Appendix A.
Table 2-9 shows all of the tables used in the bank schema along with short definitions.
Table 2-9: Bank schema definitions
Table name
Definition
Account
A particular product opened for a particular customer
Business
A corporate customer (subtype of the Customer table)
Customer
A person or corporation known to the bank
Department
A group of bank employees implementing a particular banking function
Employee
A person working for the bank
Individual
A noncorporate customer (subtype of the Customer table)
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 3: Query Primer
So far, you have seen a few examples of database queries (a.k.a. select statements) sprinkled throughout the first two chapters. Now it's time to take a closer look at the different parts of the select statement and how they interact.
Before dissecting the select statement, it might be interesting to look at how queries are executed by the MySQL server (or, for that matter, any database server). If you are using the mysql command-line tool (which I assume you are), then you have already logged in to the MySQL server by providing your username and password (and possibly a hostname if the MySQL server is running on a different computer). Once the server has verified that your username and password are correct, a database connection is generated for you to use. This connection is held by the application that requested it (which, in this case, is the mysql tool) until either the application releases the connection (i.e., as a result of your typing quit) or the server closes the connection (i.e., when the server is shut down). Each connection to the MySQL server is assigned an identifier, which is shown to you when you first log in:
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 2 to server version: 4.1.11-nt

    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
In this case, my connection ID is 2. This information might be useful to your database administrator if something goes awry, such as a malformed query that runs for hours, so you might want to jot it down.
Once the server has verified your username and password and issued you a connection, you are ready to execute queries (along with other SQL statements). Each time a query is sent to the server, the server checks the following things prior to statement execution:
  • Do you have permission to execute the statement?
  • Do you have permission to access the desired data?
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Query Mechanics
Before dissecting the select statement, it might be interesting to look at how queries are executed by the MySQL server (or, for that matter, any database server). If you are using the mysql command-line tool (which I assume you are), then you have already logged in to the MySQL server by providing your username and password (and possibly a hostname if the MySQL server is running on a different computer). Once the server has verified that your username and password are correct, a database connection is generated for you to use. This connection is held by the application that requested it (which, in this case, is the mysql tool) until either the application releases the connection (i.e., as a result of your typing quit) or the server closes the connection (i.e., when the server is shut down). Each connection to the MySQL server is assigned an identifier, which is shown to you when you first log in:
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 2 to server version: 4.1.11-nt

    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
In this case, my connection ID is 2. This information might be useful to your database administrator if something goes awry, such as a malformed query that runs for hours, so you might want to jot it down.
Once the server has verified your username and password and issued you a connection, you are ready to execute queries (along with other SQL statements). Each time a query is sent to the server, the server checks the following things prior to statement execution:
  • Do you have permission to execute the statement?
  • Do you have permission to access the desired data?
  • Is your statement syntax correct?
If your statement passes these three tests, then your query is handed to the query optimizer, whose job it is to determine the most efficient way to execute your query. The optimizer will look at such things as the order in which to join the tables named in the query and what indexes are available, and then picks an
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Query Clauses
There are several components or clauses that make up the select statement. While only one of them is mandatory when using MySQL (the select clause), you will usually include at least two or three of the six available clauses. Table 3-1 shows the different clauses and their purposes.
Table 3-1: Query clauses
Clause name
Purpose
Select
Determines which columns to include in the query's result set
From
Identifies the tables from which to draw data and how the tables should be joined
Where
Restricts the number of rows in the final result set
Group by
Used to group rows together by common column values
Having
Restricts the number of rows in the final result set using grouped data
Order by
Sorts the rows of the final result set by one or more columns
All of the clauses shown in Table 3-1 are included in the ANSI specification; additionally, there are several other clauses unique to MySQL that will be explored in Appendix B. The following sections delve into the uses of the six major query clauses.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The select Clause
Even though the select clause is the first clause of a select statement, it is one of the last clauses to be evaluated by the database server. The reason for this is that before you can determine what to include in the final result set, you need to know all of the possible columns that could be included in the final result set. In order to fully understand the role of the select clause, therefore, you will need to understand a bit about the from clause. Here's a query to get started:
            
    mysql> SELECT *
        -> FROM department;
    +---------+----------------+
    | dept_id | name           |
    +---------+----------------+
    |       1 | Operations     |
    |       2 | Loans          |
    |       3 | Administration |
    +---------+----------------+
    3 rows in set (0.04 sec)
In this query, the from clause lists a single table (department), and the select clause indicates that all columns (designated by "*") in the department table should be included in the result set. This query could be described in English as follows:
Show me all the columns in the department table.
In addition to specifying all of the columns via the asterisk character, you can explicitly name the columns you are interested in, such as:
            
    mysql> SELECT dept_id, name
        -> FROM department;
    +---------+----------------+
    | dept_id | name           |
    +---------+----------------+
    |       1 | Operations     |
    |       2 | Loans          |
    |       3 | Administration |
    +---------+----------------+
    3 rows in set (0.01 sec)
The results are identical to the first query, since all of the columns in the department table (dept_id and name) are named in the select clause. You can choose to include only a subset of the columns in the department table as well:
            
    mysql> SELECT name
        -> FROM department;
    +----------------+
    | name           |
    +----------------+
    | Operations     |
    | Loans          |
    | Administration |
    +----------------+
    3 rows in set (0.00 sec)
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The from Clause
Thus far, you have seen queries whose from clauses contain a single table. Although most SQL books will define the from clause as simply a list of one or more tables, I would like to broaden the definition as follows:
The from clause defines the tables used by a query, along with the means of linking the tables together.
This definition is composed of two separate but related concepts, which will be explored in the following sections.
When confronted with the term table, most people think of a set of related rows stored in a database. While this does describe one type of table, I would like to use the word in a more general way by removing any notion of how the data might be stored and concentrating on just the set of related rows. There are three different types of tables that meet this relaxed definition:
  • Permanent tables (i.e., created using the create table statement)
  • Temporary tables (i.e., rows returned by a subquery)
  • Virtual tables (i.e., created using the create view statement)
Each of these table types may be included in a query's from clause. By now, you should be comfortable with including a permanent table in a from clause, so I will briefly describe the other types of tables that can be referenced in a from clause.

Section 3.4.1.1: Subquery-generated tables

A subquery is a query contained within another query. Subqueries are surrounded by parentheses and can be found in various parts of a select statement; within the from clause, however, a subquery serves the role of generating a temporary table that is visible from all other query clauses and can interact with other tables named in the from clause. Here's a simple example:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The where Clause
The queries shown thus far in the chapter have selected every row from the employee, department, or account tables (except for the demonstration of distinct earlier in the chapter). Most of the time, however, you will not wish to retrieve every row from a table but will want a way to filter out those rows that are not of interest. This is a job for the where clause.
The where clause is the mechanism for filtering out unwanted rows from your result set.
For example, perhaps you are interested in retrieving data from the employee table, but only for those employees who are employed as head tellers. The following query employs a where clause to retrieve only the four head tellers:
            
    mysql> SELECT emp_id, fname, lname, start_date, title
        -> FROM employee
        -> WHERE title = 'Head Teller';
    +--------+---------+---------+------------+-------------+
    | emp_id | fname   | lname   | start_date | title       |
    +--------+---------+---------+------------+-------------+
    |      6 | Helen   | Fleming | 2004-03-17 | Head Teller |
    |     10 | Paula   | Roberts | 2002-07-27 | Head Teller |
    |     13 | John    | Blake   | 2000-05-11 | Head Teller |
    |     16 | Theresa | Markham | 2001-03-15 | Head Teller |
    +--------+---------+---------+------------+-------------+
    4 rows in set (0.00 sec)
In this case, 14 of the 18 employee rows were filtered out by the where clause. This where clause contains a single filter condition, but you can include as many conditions as required; individual conditions are separated using operators such as and, or, and not (see Chapter 4 for a complete discussion of the where clause and filter conditions). Here's an extension of the previous query that includes a second condition stating that only those employees with a start date later than January 1, 2002, should be included:
            
    mysql> SELECT emp_id, fname, lname, start_date, title
        -> FROM employee
        -> WHERE title = 'Head Teller'
        ->   AND start_date > '2002-01-01';
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The group by and having Clauses
All of the queries thus far have retrieved raw data without any manipulation. Sometimes, however, you will want to find trends in your data that will require the database server to cook the data a bit before you retrieve your result set. One such mechanism is the group by clause, which is used to group data by column values. For example, rather than looking at a list of employees and the departments to which they are assigned, you might want to look at a list of departments along with the number of employees assigned to each department. When using the group by clause, you may also use the having clause, which allows you to filter group data in the same way the where clause lets you filter raw data.
I wanted to briefly mention these two clauses so they don't catch you by surprise later in the book, but they are a bit more advanced than the other four select clauses. Therefore, I ask that you wait until Chapter 8 for a full description of how and when to use group by and having.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The order by Clause
In general, the rows in a result set returned from a query are not in any particular order. If you want to your result set in a particular order, you will need to instruct the server to sort the results using the order by clause:
The order by clause is the mechanism for sorting your result set using either raw column data or expressions based on column data.
For example, here's another look at an earlier query against the account table:
            
    mysql> SELECT open_emp_id, product_cd
        -> FROM account;
    +-------------+------------+
    | open_emp_id | product_cd |
    +-------------+------------+
    |          10 | CHK        |
    |          10 | SAV        |
    |          10 | CD         |
    |          10 | CHK        |
    |          10 | SAV        |
    |          13 | CHK        |
    |          13 | MM         |
    |           1 | CHK        |
    |           1 | SAV        |
    |           1 | MM         |
    |          16 | CHK        |
    |           1 | CHK        |
    |           1 | CD         |
    |          10 | CD         |
    |          16 | CHK        |
    |          16 | SAV        |
    |           1 | CHK        |
    |           1 | MM         |
    |           1 | CD         |
    |          16 | CHK        |
    |          16 | BUS        |
    |          10 | BUS        |
    |          16 | CHK        |
    |          13 | SBL        |
    +-------------+------------+
    24 rows in set (0.00 sec)
If you are trying to analyze data for each employee, it would be helpful to sort the results by the open_emp_id column; to do so, simply add this column to the order by clause:
            
    mysql> SELECT open_emp_id, product_cd
        -> FROM account
        -> ORDER BY open_emp_id;
    +-------------+------------+
    | open_emp_id | product_cd |
    +-------------+------------+
    |           1 | CHK        |
    |           1 | SAV        |
    |           1 | MM         |
    |           1 | CHK        |
    |           1 | CD         |
    |           1 | CHK        |
    |           1 | MM         |
    |           1 | CD         |
    |          10 | CHK        |
    |          10 | SAV        |
    |          10 | CD         |
    |          10 | CHK        |
    |          10 | SAV        |
    |          10 | CD         |
    |          10 | BUS        |
    |          13 | CHK        |
    |          13 | MM         |
    |          13 | SBL        |
    |          16 | CHK        |
    |          16 | CHK        |
    |          16 | SAV        |
    |          16 | CHK        |
    |          16 | BUS        |
    |          16 | CHK        |
    +-------------+------------+
    24 rows in set (0.00 sec)
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Exercises
The following exercises are designed to strengthen your understanding of the select statement and its various clauses. Please see Appendix C for solutions.
Retrieve the employee ID, first name, and last name for all bank employees. Sort by last name then first name.
Retrieve the account ID, customer ID, and available balance for all accounts whose status equals 'ACTIVE' and whose available balance is greater than $2,500.
Write a query against the account table that returns the IDs of the employees who opened the accounts (use the account.open_emp_id column). Include a single row for each distinct employee.
Fill in the blanks (denoted by <#>) for this multi-data-set query to achieve the results shown below:
    mysql> SELECT p.product_cd, a.cust_id, a.avail_balance
        -> FROM product p INNER JOIN account <1>
        ->   ON p.product_cd = <2>
        -> WHERE p.<3> = 'ACCOUNT';
    +------------+---------+---------------+
    | product_cd | cust_id | avail_balance |
    +------------+---------+---------------+
    | CD         |       1 |       3000.00 |
    | CD         |       6 |      10000.00 |
    | CD         |       7 |       5000.00 |
    | CD         |       9 |       1500.00 |
    | CHK        |       1 |       1057.75 |
    | CHK        |       2 |       2258.02 |
    | CHK        |       3 |       1057.75 |
    | CHK        |       4 |        534.12 |
    | CHK        |       5 |       2237.97 |
    | CHK        |       6 |        122.37 |
    | CHK        |       8 |       3487.19 |
    | CHK        |       9 |        125.67 |
    | CHK        |      10 |      23575.12 |
    | CHK        |      12 |      38552.05 |
    | MM         |       3 |       2212.50 |
    | MM         |       4 |       5487.09 |
    | MM         |       9 |       9345.55 |
    | SAV        |       1 |        500.00 |
    | SAV        |       2 |        200.00 |
    | SAV        |       4 |        767.77 |
    | SAV        |       8 |        387.99 |
    +------------+---------+---------------+
    21 rows in set (0.02 sec)
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 4: Filtering
There are some instances when you will want to work with every row in a table, such as:
  • Purging all data from a table used to stage new data warehouse feeds.
  • Modifying all rows in a table after a new column has been added.
  • Retrieving all rows from a message queue table.
In cases like these, your SQL statements won't need to have a where clause, since you don't need to exclude any rows from consideration. Most of the time, however, you will want to narrow your focus to a subset of a table's rows. Therefore, all of the SQL data statements (except the insert statement) include an optional where clause to house all filter conditions used to restrict the number of rows acted on by the SQL statement. Additionally, the select statement includes a having clause in which filter conditions pertaining to grouped data may be included. This chapter will explore the various types of filter conditions that can be employed in the where clauses of select, update, and delete statements.
A where clause may contain one or more conditions, separated by the operators and and or. If there are multiple conditions separated only by the and operator, then all of the conditions must evaluate to true for the row to be included in the result set. Consider the following where clause:
    WHERE title = 'Teller' AND start_date < '2003-01-01'
Given these two conditions, any employee who is either not a teller or began working for the bank in 2003 or later will be removed from consideration. While this example uses only two conditions, no matter how many conditions are in your where clause, if they are separated by the and operator then they must all evaluate to true for the row to be included in the result set.
If all conditions in the where clause are separated by the or operator, however, then only one of the conditions must evaluate to true for the row to be included in the result set. Consider the following two conditions:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Condition Evaluation
Content preview·Buy PDF of this chap