Practical SQL, 2nd Edition

Book description

Practical SQL is an approachable and fast-paced guide to SQL (Structured Query Language), the standard programming language for defining, organizing, and exploring data in relational databases. Anthony DeBarros, a journalist and data analyst, focuses on using SQL to find the story within your data. The examples and code use the open-source database PostgreSQL and its companion pgAdmin interface, and the concepts you learn will apply to most database management systems, including MySQL, Oracle, SQLite, and others.*

You’ll first cover the fundamentals of databases and the SQL language, then build skills by analyzing data from real-world datasets such as US Census demographics, New York City taxi rides, and earthquakes from US Geological Survey. Each chapter includes exercises and examples that teach even those who have never programmed before all the tools necessary to build powerful databases and access information quickly and efficiently.

You’ll learn how to:

•Create databases and related tables using your own data
•Aggregate, sort, and filter data to find patterns
•Use functions for basic math and advanced statistical operations
•Identify errors in data and clean them up
•Analyze spatial data with a geographic information system (PostGIS)
•Create advanced queries and automate tasks

This updated second edition has been thoroughly revised to reflect the latest in SQL features, including additional advanced query techniques for wrangling data. This edition also has two new chapters: an expanded set of instructions on for setting up your system plus a chapter on using PostgreSQL with the popular JSON data interchange format.

Learning SQL doesn’t have to be dry and complicated. Practical SQL delivers clear examples with an easy-to-follow approach to teach you the tools you need to build and manage your own databases.

* Microsoft SQL Server employs a variant of the language called T-SQL, which is not covered by Practical SQL.

Table of contents

  1. Title Page
  2. Copyright
  3. About the Author
  4. Preface to the Second Edition
  5. Acknowledgments
  6. Introduction
    1. What Is SQL?
    2. Why SQL?
    3. Who Is This Book For?
    4. What You’ll Learn
  7. Chapter 1: Setting Up Your Coding Environment
    1. Installing a Text Editor
    2. Downloading Code and Data from GitHub
    3. Installing PostgreSQL and pgAdmin
      1. Windows Installation
      2. macOS Installation
      3. Linux Installation
    4. Working with pgAdmin
      1. Launching pgAdmin and Setting a Master Password
      2. Connecting to the Default postgres Database
      3. Exploring the Query Tool
      4. Customizing pgAdmin
    5. Alternatives to pgAdmin
    6. Wrapping Up
  8. Chapter 2: Creating Your First Database and Table
    1. Understanding Tables
    2. Creating a Database
      1. Executing SQL in pgAdmin
      2. Connecting to the analysis Database
    3. Creating a Table
      1. Using the CREATE TABLE Statement
      2. Making the teachers Table
    4. Inserting Rows into a Table
      1. Using the INSERT Statement
      2. Viewing the Data
    5. Getting Help When Code Goes Bad
    6. Formatting SQL for Readability
    7. Wrapping Up
  9. Chapter 3: Beginning Data Exploration with SELECT
    1. Basic SELECT Syntax
      1. Querying a Subset of Columns
    2. Sorting Data with ORDER BY
    3. Using DISTINCT to Find Unique Values
    4. Filtering Rows with WHERE
      1. Using LIKE and ILIKE with WHERE
      2. Combining Operators with AND and OR
    5. Putting It All Together
    6. Wrapping Up
  10. Chapter 4: Understanding Data Types
    1. Understanding Characters
    2. Understanding Numbers
      1. Using Integers
      2. Auto-Incrementing Integers
      3. Using Decimal Numbers
      4. Choosing Your Number Data Type
    3. Understanding Dates and Times
    4. Using the interval Data Type in Calculations
    5. Understanding JSON and JSONB
    6. Using Miscellaneous Types
    7. Transforming Values from One Type to Another with CAST
    8. Using CAST Shortcut Notation
    9. Wrapping Up
  11. Chapter 5: Importing and Exporting Data
    1. Working with Delimited Text Files
      1. Handling Header Rows
      2. Quoting Columns That Contain Delimiters
    2. Using COPY to Import Data
    3. Importing Census Data Describing Counties
      1. Creating the us_counties_pop_est_2019 Table
      2. Understanding Census Columns and Data Types
      3. Performing the Census Import with COPY
      4. Inspecting the Import
    4. Importing a Subset of Columns with COPY
    5. Importing a Subset of Rows with COPY
    6. Adding a Value to a Column During Import
    7. Using COPY to Export Data
      1. Exporting All Data
      2. Exporting Particular Columns
      3. Exporting Query Results
    8. Importing and Exporting Through pgAdmin
    9. Wrapping Up
  12. Chapter 6: Basic Math and Stats with SQL
    1. Understanding Math Operators and Functions
      1. Understanding Math and Data Types
      2. Adding, Subtracting, and Multiplying
      3. Performing Division and Modulo
      4. Using Exponents, Roots, and Factorials
      5. Minding the Order of Operations
    2. Doing Math Across Census Table Columns
      1. Adding and Subtracting Columns
      2. Finding Percentages of the Whole
      3. Tracking Percent Change
    3. Using Aggregate Functions for Averages and Sums
    4. Finding the Median
      1. Finding the Median with Percentile Functions
      2. Finding Median and Percentiles with Census Data
      3. Finding Other Quantiles with Percentile Functions
    5. Finding the Mode
    6. Wrapping Up
  13. Chapter 7: Joining Tables in a Relational Database
    1. Linking Tables Using JOIN
    2. Relating Tables with Key Columns
    3. Querying Multiple Tables Using JOIN
    4. Understanding JOIN Types
      1. JOIN
      2. LEFT JOIN and RIGHT JOIN
      3. FULL OUTER JOIN
      4. CROSS JOIN
    5. Using NULL to Find Rows with Missing Values
    6. Understanding the Three Types of Table Relationships
      1. One-to-One Relationship
      2. One-to-Many Relationship
      3. Many-to-Many Relationship
    7. Selecting Specific Columns in a Join
    8. Simplifying JOIN Syntax with Table Aliases
    9. Joining Multiple Tables
    10. Combining Query Results with Set Operators
      1. UNION and UNION ALL
      2. INTERSECT and EXCEPT
    11. Performing Math on Joined Table Columns
    12. Wrapping Up
  14. Chapter 8: Table Design That Works for You
    1. Following Naming Conventions
      1. Quoting Identifiers Enables Mixed Case
      2. Pitfalls with Quoting Identifiers
      3. Guidelines for Naming Identifiers
    2. Controlling Column Values with Constraints
      1. Primary Keys: Natural vs. Surrogate
      2. Foreign Keys
      3. How to Automatically Delete Related Records with CASCADE
      4. The CHECK Constraint
      5. The UNIQUE Constraint
      6. The NOT NULL Constraint
      7. How to Remove Constraints or Add Them Later
    3. Speeding Up Queries with Indexes
      1. B-Tree: PostgreSQL’s Default Index
      2. Considerations When Using Indexes
    4. Wrapping Up
  15. Chapter 9: Extracting Information by Grouping and Summarizing
    1. Creating the Library Survey Tables
      1. Creating the 2018 Library Data Table
      2. Creating the 2017 and 2016 Library Data Tables
    2. Exploring the Library Data Using Aggregate Functions
      1. Counting Rows and Values Using count()
      2. Finding Maximum and Minimum Values Using max() and min()
      3. Aggregating Data Using GROUP BY
    3. Wrapping Up
  16. Chapter 10: Inspecting and Modifying Data
    1. Importing Data on Meat, Poultry, and Egg Producers
    2. Interviewing the Dataset
      1. Checking for Missing Values
      2. Checking for Inconsistent Data Values
      3. Checking for Malformed Values Using length()
    3. Modifying Tables, Columns, and Data
      1. Modifying Tables with ALTER TABLE
      2. Modifying Values with UPDATE
      3. Viewing Modified Data with RETURNING
      4. Creating Backup Tables
      5. Restoring Missing Column Values
      6. Updating Values for Consistency
      7. Repairing ZIP Codes Using Concatenation
      8. Updating Values Across Tables
    4. Deleting Unneeded Data
      1. Deleting Rows from a Table
      2. Deleting a Column from a Table
      3. Deleting a Table from a Database
    5. Using Transactions to Save or Revert Changes
    6. Improving Performance When Updating Large Tables
    7. Wrapping Up
  17. Chapter 11: Statistical Functions in SQL
    1. Creating a Census Stats Table
      1. Measuring Correlation with corr(Y, X)
      2. Checking Additional Correlations
      3. Predicting Values with Regression Analysis
      4. Finding the Effect of an Independent Variable with r-Squared
      5. Finding Variance and Standard Deviation
    2. Creating Rankings with SQL
      1. Ranking with rank() and dense_rank()
      2. Ranking Within Subgroups with PARTITION BY
    3. Calculating Rates for Meaningful Comparisons
      1. Finding Rates of Tourism-Related Businesses
    4. Smoothing Uneven Data
    5. Wrapping Up
  18. Chapter 12: Working with Dates and Times
    1. Understanding Data Types and Functions for Dates and Times
    2. Manipulating Dates and Times
      1. Extracting the Components of a timestamp Value
      2. Creating Datetime Values from timestamp Components
      3. Retrieving the Current Date and Time
    3. Working with Time Zones
      1. Finding Your Time Zone Setting
      2. Setting the Time Zone
    4. Performing Calculations with Dates and Times
      1. Finding Patterns in New York City Taxi Data
      2. Finding Patterns in Amtrak Data
    5. Wrapping Up
  19. Chapter 13: Advanced Query Techniques
    1. Using Subqueries
      1. Filtering with Subqueries in a WHERE Clause
      2. Creating Derived Tables with Subqueries
      3. Joining Derived Tables
      4. Generating Columns with Subqueries
      5. Understanding Subquery Expressions
      6. Using Subqueries with LATERAL
    2. Using Common Table Expressions
    3. Performing Cross Tabulations
      1. Installing the crosstab() Function
      2. Tabulating Survey Results
      3. Tabulating City Temperature Readings
    4. Reclassifying Values with CASE
    5. Using CASE in a Common Table Expression
    6. Wrapping Up
  20. Chapter 14: Mining Text to Find Meaningful Data
    1. Formatting Text Using String Functions
      1. Case Formatting
      2. Character Information
      3. Removing Characters
      4. Extracting and Replacing Characters
    2. Matching Text Patterns with Regular Expressions
      1. Regular Expression Notation
      2. Using Regular Expressions with WHERE
      3. Regular Expression Functions to Replace or Split Text
      4. Turning Text to Data with Regular Expression Functions
    3. Full-Text Search in PostgreSQL
      1. Text Search Data Types
      2. Creating a Table for Full-Text Search
      3. Searching Speech Text
      4. Ranking Query Matches by Relevance
    4. Wrapping Up
  21. Chapter 15: Analyzing Spatial Data with PostGIS
    1. Enabling PostGIS and Creating a Spatial Database
    2. Understanding the Building Blocks of Spatial Data
    3. Understanding Two-Dimensional Geometries
      1. Well-Known Text Formats
      2. Projections and Coordinate Systems
      3. Spatial Reference System Identifier
    4. Understanding PostGIS Data Types
    5. Creating Spatial Objects with PostGIS Functions
      1. Creating a Geometry Type from Well-Known Text
      2. Creating a Geography Type from Well-Known Text
      3. Using Point Functions
      4. Using LineString Functions
      5. Using Polygon Functions
    6. Analyzing Farmers’ Markets Data
      1. Creating and Filling a Geography Column
      2. Adding a Spatial Index
      3. Finding Geographies Within a Given Distance
      4. Finding the Distance Between Geographies
      5. Finding the Nearest Geographies
    7. Working with Census Shapefiles
      1. Understanding the Contents of a Shapefile
      2. Loading Shapefiles
      3. Exploring the Census 2019 Counties Shapefile
      4. Examining Demographics Within a Distance
    8. Performing Spatial Joins
      1. Exploring Roads and Waterways Data
      2. Joining the Census Roads and Water Tables
      3. Finding the Location Where Objects Intersect
    9. Wrapping Up
  22. Chapter 16: Working with JSON Data
    1. Understanding JSON Structure
    2. Considering When to Use JSON with SQL
    3. Using json and jsonb Data Types
    4. Importing and Indexing JSON Data
    5. Using json and jsonb Extraction Operators
      1. Key Value Extraction
      2. Array Element Extraction
      3. Path Extraction
      4. Containment and Existence
    6. Analyzing Earthquake Data
      1. Exploring and Loading the Earthquake Data
      2. Working with Earthquake Times
      3. Finding the Largest and Most-Reported Earthquakes
      4. Converting Earthquake JSON to Spatial Data
    7. Generating and Manipulating JSON
      1. Turning Query Results into JSON
      2. Adding, Updating, and Deleting Keys and Values
    8. Using JSON Processing Functions
      1. Finding the Length of an Array
      2. Returning Array Elements as Rows
    9. Wrapping Up
  23. Chapter 17: Saving Time with Views, Functions, and Triggers
    1. Using Views to Simplify Queries
      1. Creating and Querying Views
      2. Creating and Refreshing a Materialized View
      3. Inserting, Updating, and Deleting Data Using a View
    2. Creating Your Own Functions and Procedures
      1. Creating the percent_change() Function
      2. Using the percent_change() Function
      3. Updating Data with a Procedure
      4. Using the Python Language in a Function
    3. Automating Database Actions with Triggers
      1. Logging Grade Updates to a Table
      2. Automatically Classifying Temperatures
    4. Wrapping Up
  24. Chapter 18: Using PostgreSQL from the Command Line
    1. Setting Up the Command Line for psql
      1. Windows psql Setup
      2. macOS psql Setup
      3. Linux psql Setup
    2. Working with psql
      1. Launching psql and Connecting to a Database
      2. Running SQL Queries on psql
      3. Navigating and Formatting Results
      4. Meta-Commands for Database Information
      5. Importing, Exporting, and Using Files
    3. Additional Command Line Utilities to Expedite Tasks
      1. Adding a Database with createdb
      2. Loading Shapefiles with shp2pgsql
    4. Wrapping Up
  25. Chapter 19: Maintaining Your Database
    1. Recovering Unused Space with VACUUM
      1. Tracking Table Size
      2. Monitoring the Autovacuum Process
      3. Running VACUUM Manually
      4. Reducing Table Size with VACUUM FULL
    2. Changing Server Settings
      1. Locating and Editing postgresql.conf
      2. Reloading Settings with pg_ctl
    3. Backing Up and Restoring Your Database
      1. Using pg_dump to Export a Database or Table
      2. Restoring a Database Export with pg_restore
      3. Exploring Additional Backup and Restore Options
    4. Wrapping Up
  26. Chapter 20: Telling Your Data’s Story
    1. Start with a Question
    2. Document Your Process
    3. Gather Your Data
    4. No Data? Build Your Own Database
    5. Assess the Data’s Origins
    6. Interview the Data with Queries
    7. Consult the Data’s Owner
    8. Identify Key Indicators and Trends over Time
    9. Ask Why
    10. Communicate Your Findings
    11. Wrapping Up
  27. Appendix: Additional PostgreSQL Resources
    1. PostgreSQL Development Environments
    2. PostgreSQL Utilities, Tools, and Extensions
    3. PostgreSQL News and Community
    4. Documentation
  28. Index

Product information

  • Title: Practical SQL, 2nd Edition
  • Author(s): Anthony DeBarros
  • Release date: March 2022
  • Publisher(s): No Starch Press
  • ISBN: 9781718501065