Data Analysis Using SQL and Excel, 2nd Edition

Book description

A practical guide to data mining using SQL and Excel

Data Analysis Using SQL and Excel, 2nd Edition shows you how to leverage the two most popular tools for data query and analysis—SQL and Excel—to perform sophisticated data analysis without the need for complex and expensive data mining tools. Written by a leading expert on business data mining, this book shows you how to extract useful business information from relational databases. You'll learn the fundamental techniques before moving into the "where" and "why" of each analysis, and then learn how to design and perform these analyses using SQL and Excel. Examples include SQL and Excel code, and the appendix shows how non-standard constructs are implemented in other major databases, including Oracle and IBM DB2/UDB. The companion website includes datasets and Excel spreadsheets, and the book provides hints, warnings, and technical asides to help you every step of the way.

Data Analysis Using SQL and Excel, 2nd Edition shows you how to perform a wide range of sophisticated analyses using these simple tools, sparing you the significant expense of proprietary data mining tools like SAS.

  • Understand core analytic techniques that work with SQL and Excel
  • Ensure your analytic approach gets you the results you need
  • Design and perform your analysis using SQL and Excel

Data Analysis Using SQL and Excel, 2nd Edition shows you how to best use the tools you already know to achieve expert results.

Table of contents

  1. Foreword
  2. Introduction
    1. Overview of the Book and Technology
    2. How This Book Is Organized
    3. Who Should Read This Book
    4. Tools You Will Need
    5. What’s on the Website
    6. Summary
  3. Chapter 1: A Data Miner Looks at SQL
    1. Databases, SQL, and Big Data
    2. Picturing the Structure of the Data
    3. Picturing Data Analysis Using Dataflows
    4. SQL Queries
    5. Subqueries and Common Table Expressions Are Our Friends
    6. Lessons Learned
  4. Chapter 2: What’s in a Table? Getting Started with Data Exploration
    1. What Is Data Exploration?
    2. Excel for Charting
    3. Sparklines
    4. What Values Are in the Columns?
    5. More Values to Explore—Min, Max, and Mode
    6. Exploring String Values
    7. Exploring Values in Two Columns
    8. From Summarizing One Column to Summarizing All Columns
    9. Lessons Learned
  5. Chapter 3: How Different Is Different?
    1. Basic Statistical Concepts
    2. How Different Are the Averages?
    3. Sampling from a Table
    4. Counting Possibilities
    5. Ratios and Their Statistics
    6. Chi-Square
    7. What Months and Payment Types Have Unusual Affinities for Which Types of Products?
    8. Lessons Learned
  6. Chapter 4: Where Is It All Happening? Location, Location, Location
    1. Latitude and Longitude
    2. Census Demographics
    3. Geographic Hierarchies
    4. Mapping in Excel
    5. Lessons Learned
  7. Chapter 5: It’s a Matter of Time
    1. Dates and Times in Databases
    2. Starting to Investigate Dates
    3. How Long Between Two Dates?
    4. Year-over-Year Comparisons
    5. Counting Active Customers by Day
    6. Simple Chart Animation in Excel
    7. Lessons Learned
  8. Chapter 6: How Long Will Customers Last? Survival Analysis to Understand Customers and Their Value
    1. Background on Survival Analysis
    2. The Hazard Calculation
    3. Survival and Retention
    4. Comparing Different Groups of Customers
    5. Comparing Survival over Time
    6. Important Measures Derived from Survival
    7. Using Survival for Customer Value Calculations
    8. Forecasting
    9. Lessons Learned
  9. Chapter 7: Factors Affecting Survival: The What and Why of Customer Tenure
    1. Which Factors Are Important and When
    2. Left Truncation
    3. Time Windowing
    4. Competing Risks
    5. Before and After
    6. Lessons Learned
  10. Chapter 8: Customer Purchases and Other Repeated Events
    1. Identifying Customers
    2. RFM Analysis
    3. Which Households Are Increasing Purchase Amounts Over Time?
    4. Time to Next Event
    5. Lessons Learned
  11. Chapter 9: What’s in a Shopping Cart? Market Basket Analysis
    1. Exploring the Products
    2. Products and Customer Worth
    3. Product Geographic Distribution
    4. Which Customers Have Particular Products?
    5. Lessons Learned
  12. Chapter 10: Association Rules and Beyond
    1. Item Sets
    2. The Simplest Association Rules
    3. One-Way Association Rules
    4. Two-Way Associations
    5. Extending Association Rules
    6. Lessons Learned
  13. Chapter 11: Data Mining Models in SQL
    1. Introduction to Directed Data Mining
    2. Look-Alike Models
    3. Lookup Model for Most Popular Product
    4. Lookup Model for Order Size
    5. Lookup Model for Probability of Response
    6. Naïve Bayesian Models (Evidence Models)
    7. Lessons Learned
  14. Chapter 12: The Best-Fit Line: Linear Regression Models
    1. The Best-Fit Line
    2. Measuring Goodness of Fit Using R2
    3. Direct Calculation of Best-Fit Line Coefficients
    4. Weighted Linear Regression
    5. More Than One Input Variable
    6. Lessons Learned
  15. Chapter 13: Building Customer Signatures for Further Analysis
    1. What Is a Customer Signature?
    2. Designing Customer Signatures
    3. Operations to Build Customer Signatures
    4. Extracting Features
    5. Summarizing Customer Behaviors
    6. Lessons Learned
  16. Chapter 14: Performance Is the Issue: Using SQL Effectively
    1. Query Engines and Performance
    2. Using Indexes Effectively
    3. When OR Is a Bad Thing
    4. Pros and Cons: Different Ways of Expressing the Same Thing
    5. Window Functions
    6. Lessons Learned
  17. Appendix Equivalent Constructs Among Databases
    1. String Functions
    2. Date/Time Functions
    3. Mathematical Functions
    4. Other Functions and Features
  18. EULA

Product information

  • Title: Data Analysis Using SQL and Excel, 2nd Edition
  • Author(s): Gordon S. Linoff
  • Release date: December 2015
  • Publisher(s): Wiley
  • ISBN: 9781119021438