Book description
With the explosion of data, computing power, and cloud data warehouses, SQL has become an even more indispensable tool for the savvy analyst or data scientist. This practical book reveals new and hidden ways to improve your SQL skills, solve problems, and make the most of SQL as part of your workflow.
You'll learn how to use both common and exotic SQL features, such as joins, window functions, subqueries, and regular expressions, in innovative ways, as well as how to combine SQL techniques to accomplish your goals faster, with understandable code. If you work with SQL databases, this is a must-have reference.
- Learn the key steps for preparing your data for analysis
- Perform time series analysis using SQL's date and time manipulations
- Use cohort analysis to investigate how groups change over time
- Use SQL's powerful functions and operators for text analysis
- Detect outliers in your data and replace them with alternate values
- Establish causality using experiment analysis, also known as A/B testing
Table of contents
- Preface
- 1. Analysis with SQL
- 2. Preparing Data for Analysis
  - Types of Data
  - SQL Query Structure
  - Profiling: Distributions
  - Profiling: Data Quality
  - Preparing: Data Cleaning
  - Preparing: Shaping Data
  - Conclusion
- 3. Time Series Analysis
  - Date, Datetime, and Time Manipulations
  - The Retail Sales Data Set
  - Trending the Data
  - Rolling Time Windows
  - Analyzing with Seasonality
  - Conclusion
- 4. Cohort Analysis
  - Cohorts: A Useful Analysis Framework
  - The Legislators Data Set
  - Retention
  - Related Cohort Analyses
  - Cross-Section Analysis, Through a Cohort Lens
  - Conclusion
- 5. Text Analysis
  - Why Text Analysis with SQL?
  - The UFO Sightings Data Set
  - Text Characteristics
  - Text Parsing
  - Text Transformations
  - Finding Elements Within Larger Blocks of Text
  - Constructing and Reshaping Text
  - Conclusion
- 6. Anomaly Detection
  - Capabilities and Limits of SQL for Anomaly Detection
  - The Data Set
  - Detecting Outliers
  - Forms of Anomalies
  - Handling Anomalies
  - Conclusion
- 7. Experiment Analysis
  - Strengths and Limits of Experiment Analysis with SQL
  - The Data Set
  - Types of Experiments
  - Challenges with Experiments and Options for Rescuing Flawed Experiments
  - When Controlled Experiments Aren’t Possible: Alternative Analyses
  - Conclusion
- 8. Creating Complex Data Sets for Analysis
  - When to Use SQL for Complex Data Sets
  - Code Organization
  - Organizing Computations
  - Managing Data Set Size and Privacy Concerns
  - Conclusion
- 9. Conclusion
- Index
Product information
- Title: SQL for Data Analysis
- Author(s):
- Release date: September 2021
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492088783
