The Art of SQL

by Stephane Faroult, Peter Robson

March 2006

Intermediate to advanced

367 pages

11h 20m

English

O'Reilly Media, Inc.

Dedication
Preface
    Why Another SQL Book?
    Audience
    Assumptions This Book Makes
    Contents of This Book
    Conventions Used in This Book
    Using Code Examples
    Comments and Questions
    Safari® Enabled
    Acknowledgments
1. Laying Plans
    1.1. The Relational View of Data
    1.2. The Importance of Being Normal
    1.2.1. Step 1: Ensure Atomicity
    1.2.2. Step 2: Check Dependence on the Whole Key
    1.2.3. Step 3: Check Attribute Independence
    1.3. To Be or Not to Be, or to Be Null
    1.4. Qualifying Boolean Columns
    1.5. Understanding Subtypes
    1.6. Stating the Obvious
    1.7. The Dangers of Excess Flexibility
    1.8. The Difficulties of Historical Data
    1.9. Design and Performance
    1.10. Processing Flow
    1.11. Centralizing Your Data
    1.12. System Complexity
    1.13. The Completed Plans
2. Waging War
    2.1. Query Identification
    2.2. Stable Database Connections
    2.3. Strategy Before Tactics
    2.4. Problem Definition Before Solution
    2.5. Stable Database Schema
    2.6. Operations Against Actual Data
    2.7. Set Processing in SQL
    2.8. Action-Packed SQL Statements
    2.9. Profitable Database Accesses
    2.10. Closeness to the DBMS Kernel
    2.11. Doing Only What Is Required
    2.12. SQL Statements Mirror Business Logic
    2.13. Program Logic into Queries
    2.14. Multiple Updates at Once
    2.15. Careful Use of User-Written Functions
    2.16. Succinct SQL
    2.17. Offensive Coding with SQL
    2.18. Discerning Use of Exceptions
3. Tactical Dispositions
    3.1. The Identification of “Entry Points”
    3.2. Indexes and Content Lists
    3.3. Making Indexes Work
    3.4. Indexes with Functions and Conversions
    3.5. Indexes and Foreign Keys
    3.6. Multiple Indexing of the Same Columns
    3.7. System-Generated Keys
    3.8. Variability of Index Accesses
4. Maneuvering
    4.1. The Nature of SQL
    4.1.1. SQL and Databases
    4.1.2. SQL and the Optimizer
    4.1.3. Limits of the Optimizer
    4.2. Five Factors Governing the Art of SQL
    4.2.1. Total Quantity of Data
    4.2.2. Criteria Defining the Result Set
    4.2.3. Size of the Result Set
    4.2.4. Number of Tables
    4.2.4.1. Joins
    4.2.4.2. Complex queries and complex views
    4.2.5. Number of Other Users
    4.3. Filtering
    4.3.1. Meaning of Filtering Conditions
    4.3.2. Evaluation of Filtering Conditions
    4.3.2.1. Buyers of Batmobiles
    4.3.2.2. More Batmobile purchases
    4.3.2.3. Lessons to be learned from the Batmobile trade
    4.3.3. Querying Large Quantities of Data
    4.3.4. The Proportions of Retrieved Data
5. Terrain
    5.1. Structural Types
    5.2. The Conflicting Goals
    5.3. Considering Indexes as Data Repositories
    5.4. Forcing Row Ordering
    5.5. Automatically Grouping Data
    5.5.1. Round-Robin Partitioning
    5.5.2. Data-Driven Partitioning
    5.6. The Double-Edged Sword of Partitioning
    5.7. Partitioning and Data Distribution
    5.8. The Best Way to Partition Data
    5.9. Pre-Joining Tables
    5.10. Holy Simplicity
6. The Nine Situations
    6.1. Small Result Set, Direct Specific Criteria
    6.1.1. Index Usability
    6.1.2. Query Efficiency and Index Usage
    6.1.3. Data Dispersion
    6.1.4. Criterion Indexability
    6.2. Small Result Set, Indirect Criteria
    6.3. Small Intersection of Broad Criteria
    6.4. Small Intersection, Indirect Broad Criteria
    6.5. Large Result Set
    6.6. Self-Joins on One Table
    6.7. Result Set Obtained by Aggregation
    6.8. Simple or Range Searching on Dates
    6.8.1. Many Items, Few Historical Values
    6.8.1.1. Using subqueries
    6.8.1.2. Using OLAP functions
    6.8.2. Many Historical Values Per Item
    6.8.3. Current Values
    6.9. Result Set Predicated on Absence of Data
7. Variations in Tactics
    7.1. Tree Structures
    7.1.1. Tree Structures Versus Master/Detail Relationships
    7.1.2. Practical Examples of Hierarchies
    7.2. Representing Trees in an SQL Database
    7.3. Practical Implementation of Trees
    7.3.1. Adjacency Model
    7.3.2. Materialized Path Model
    7.3.3. Nested Sets Model (After Celko)
    7.4. Walking a Tree with SQL
    7.4.1. Top-Down Walk: The Vandamme Query
    7.4.1.1. Adjacency model
    7.4.1.2. Materialized path model
    7.4.1.3. Nested sets model
    7.4.1.4. Comparing the Vandamme query under the various models
    7.4.2. Bottom-Up Walk: The Highlanders Query
    7.4.2.1. Adjacency model
    7.4.2.2. Materialized path model
    7.4.2.3. Nested sets model
    7.4.2.4. Comparing the various models for the Highlanders query
    7.5. Aggregating Values from Trees
    7.5.1. Aggregation of Values Stored in Leaf Nodes
    7.5.1.1. Modeling head counts
    7.5.1.2. Computing head counts at every level
    7.5.2. Propagation of Percentages Across Different Levels
8. Weaknesses and Strengths
    8.1. Deceiving Criteria
    8.2. Abstract Layers
    8.3. Distributed Systems
    8.4. Dynamically Defined Search Criteria
    8.4.1. Designing a Simple Movie Database and the Main Query
    8.4.2. Right-Sizing Queries
    8.4.3. Wrapping SQL in PHP

9. Multiple Fronts
    9.1. The Database Engine as a Service Provider
    9.1.1. The Virtues of Indexes
    9.1.2. A Just-So Story
    9.1.3. Get in Line
    9.2. Concurrent Data Changes
    9.2.1. Locking
    9.2.1.1. Locking granularity
    9.2.1.2. Lock handling
    9.2.1.3. Locking and committing
    9.2.1.4. Locking and scalability
    9.2.2. Contention
    9.2.2.1. Insertion and contention
    9.2.2.2. DBA solutions
    9.2.2.3. Architectural solutions
    9.2.2.4. Development solutions
    9.2.2.5. Results
10. Assembly of Forces
    10.1. Increasing Volumes
    10.1.1. Sensitivity of Operations to Volume Increases
    10.1.1.1. Insensitivity to volume increase
    10.1.1.2. Linear sensitivity to volume increases
    10.1.1.3. Non-linear sensitivity to volume increases
    10.1.1.4. Putting it all together
    10.1.1.5. Disentangling subqueries
    10.1.2. Partitioning to the Rescue
    10.1.3. Data Purges
    10.2. Data Warehousing
    10.2.1. Facts and Dimensions: the Star Schema
    10.2.2. Query Tools
    10.2.3. Extraction, Transformation, and Loading
    10.2.3.1. Data extraction
    10.2.3.2. Transformation
    10.2.3.3. Loading
    10.2.3.4. Integrity constraints and indexes
    10.2.4. Querying Dimensions and Facts: Ad Hoc Reports
    10.2.4.1. The star transformation
    10.2.4.2. Emulating the star transformation
    10.2.4.3. Querying a star schema the way it is not intended to be queried
    10.2.5. A (Strong) Word of Caution
11. Stratagems
    11.1. Turning Data Around
    11.1.1. Rows That Should Have Been Columns
    11.1.2. Columns That Should Have Been Rows
    11.1.2.1. Creating a pivot table
    11.1.2.2. Multiplying rows with a pivot table
    11.1.2.3. Using pivot table values
    11.1.2.4. The pivot and unpivot operators
    11.1.3. Single Columns That Should Have Been Something Else
    11.1.3.1. First normal form on the fly
    11.1.3.2. Lifting the veil on the Chapter 7 mystery path explosion
    11.2. Querying with a Variable in List
    11.3. Aggregating by Range (Bands)
    11.4. Superseding a General Case
    11.5. Selecting Rows That Match Several Items in a List
    11.6. Finding the Best Match
    11.7. Optimizer Directives
12. Employment of Spies
    12.1. The Database Is Slow
    12.2. The Components of Server Load
    12.3. Defining Good Performance
    12.3.1. Knowing What You Spend
    12.3.2. Knowing What You Get
    12.3.3. Checking Against Acknowledged Standards
    12.3.4. Defining Performance Goals
    12.4. Thinking in Business Tasks
    12.5. Execution Plans
    12.5.1. Identifying the Fastest Execution Plan
    12.5.1.1. Our contestants
    12.5.1.2. Our battle field
    12.5.1.3. And the winner is...
    12.5.2. Forcing the Right Execution Plan
    12.5.2.1. A stubborn query
    12.5.2.2. Study of search criteria
    12.5.2.3. A moral to the story
    12.6. Using Execution Plans Properly
    12.6.1. How Not to Execute a Query
    12.6.2. Hidden Complexity
    12.7. What Really Matters?
PHOTO CREDITS
About the Authors
About the Author
Copyright

Content preview from The Art of SQL

Chapter 4. Maneuvering

Thinking SQL Statements

There is only one principle of war, and that’s this. Hit the other fellow, as quickly as you can, as hard as you can, where it hurts him most, when he ain’t lookin’.
—Field Marshal Sir William Slim (1891-1970) quoting an anonymous Sergeant-Major

In this chapter, we are going to take a close look at the SQL query and examine how its construct can vary according to the tactical demands of particular situations. This will involve examining complex queries and reviewing how they can be decomposed into a succession of smaller components, all interdependent, and all contributing to a final, complete query.

The Nature of SQL

Before we begin examining query constructs in detail, we need to review some of the general characteristics of SQL itself: how it relates to the database engine and the associated optimizer, and what may limit the efficiency of the optimizer.

SQL and Databases

Relational databases owe their existence to pioneering work by E.F. Codd on the relational theory. From the outset, Codd’s work provided a very strong mathematical basis to what had so far been a mostly empirical discipline. To make an analogy, for thousands of years mankind has built bridges to span rivers, but frequently these structures were grossly overengineered simply because the master builders of the time didn’t fully understand the true relationships between the materials they used to build their bridges, and the consequent strengths of these bridges. Once the ...

Publisher Resources

ISBN: 0596008945
