Practical Time Series Analysis

Book Description

Time series data analysis is increasingly important due to the massive production of such data through the internet of things, the digitalization of healthcare, and the rise of smart cities. As continuous monitoring and data collection become more common, the need for competent time series analysis with both statistical and machine learning techniques will increase.

Covering innovations in time series data analysis and use cases from the real world, this practical guide will help you solve the most common data engineering and analysis challenges in time series, using both traditional statistical and modern machine learning techniques. Author Aileen Nielsen offers an accessible, well-rounded introduction to time series in both R and Python that will have data scientists, software engineers, and researchers up and running quickly.

You’ll get the guidance you need to confidently:

  • Find and wrangle time series data
  • Undertake exploratory time series data analysis
  • Store temporal data
  • Simulate time series data
  • Generate and select features for a time series
  • Measure error
  • Forecast and classify time series with machine or deep learning
  • Evaluate accuracy and performance

Table of Contents

  1. Preface
    1. Who Should Read This Book
      1. Expected Background
    2. Why I Wrote This Book
    3. Navigating This Book
    4. Online Resources
    5. Conventions Used in This Book
    6. Using Code Examples
    7. O’Reilly Online Learning
    8. How to Contact Us
    9. Acknowledgments
  2. 1. Time Series: An Overview and a Quick History
    1. The History of Time Series in Diverse Applications
      1. Medicine as a Time Series Problem
      2. Forecasting Weather
      3. Forecasting Economic Growth
      4. Astronomy
    2. Time Series Analysis Takes Off
    3. The Origins of Statistical Time Series Analysis
    4. The Origins of Machine Learning Time Series Analysis
    5. More Resources
  3. 2. Finding and Wrangling Time Series Data
    1. Where to Find Time Series Data
      1. Prepared Data Sets
      2. Found Time Series
    2. Retrofitting a Time Series Data Collection from a Collection of Tables
      1. A Worked Example: Assembling a Time Series Data Collection
      2. Constructing a Found Time Series
    3. Timestamping Troubles
      1. Whose Timestamp?
      2. Guesstimating Timestamps to Make Sense of Data
      3. What’s a Meaningful Time Scale?
    4. Cleaning Your Data
      1. Handling Missing Data
      2. Upsampling and Downsampling
      3. Smoothing Data
    5. Seasonal Data
    6. Time Zones
    7. Preventing Lookahead
    8. More Resources
  4. 3. Exploratory Data Analysis for Time Series
    1. Familiar Methods
      1. Plotting
      2. Histograms
      3. Scatter Plots
    2. Time Series–Specific Exploratory Methods
      1. Understanding Stationarity
      2. Applying Window Functions
      3. Understanding and Identifying Self-Correlation
      4. Spurious Correlations
    3. Some Useful Visualizations
      1. 1D Visualizations
      2. 2D Visualizations
      3. 3D Visualizations
    4. More Resources
  5. 4. Simulating Time Series Data
    1. What’s Special About Simulating Time Series?
      1. Simulation Versus Forecasting
    2. Simulations in Code
      1. Doing the Work Yourself
      2. Building a Simulation Universe That Runs Itself
      3. A Physics Simulation
    3. Final Notes on Simulations
      1. Statistical Simulations
      2. Deep Learning Simulations
    4. More Resources
  6. 5. Storing Temporal Data
    1. Defining Requirements
      1. Live Data Versus Stored Data
    2. Database Solutions
      1. SQL Versus NoSQL
      2. Popular Time Series Database and File Solutions
    3. File Solutions
      1. NumPy
      2. Pandas
      3. Standard R Equivalents
      4. Xarray
    4. More Resources
  7. 6. Statistical Models for Time Series
    1. Why Not Use a Linear Regression?
    2. Statistical Methods Developed for Time Series
      1. Autoregressive Models
      2. Moving Average Models
      3. Autoregressive Integrated Moving Average Models
      4. Vector Autoregression
      5. Variations on Statistical Models
    3. Advantages and Disadvantages of Statistical Methods for Time Series
    4. More Resources
  8. 7. State Space Models for Time Series
    1. State Space Models: Pluses and Minuses
    2. The Kalman Filter
      1. Overview
      2. Code for the Kalman Filter
    3. Hidden Markov Models
      1. How the Model Works
      2. How We Fit the Model
      3. Fitting an HMM in Code
    4. Bayesian Structural Time Series
      1. Code for bsts
    5. More Resources
  9. 8. Generating and Selecting Features for a Time Series
    1. Introductory Example
    2. General Considerations When Computing Features
      1. The Nature of the Time Series
      2. Domain Knowledge
      3. External Considerations
    3. A Catalog of Places to Find Features for Inspiration
      1. Open Source Time Series Feature Generation Libraries
      2. Domain-Specific Feature Examples
    4. How to Select Features Once You Have Generated Them
    5. Concluding Thoughts
    6. More Resources
  10. 9. Machine Learning for Time Series
    1. Time Series Classification
      1. Selecting and Generating Features
      2. Decision Tree Methods
    2. Clustering
      1. Generating Features from the Data
      2. Temporally Aware Distance Metrics
      3. Clustering Code
    3. More Resources
  11. 10. Deep Learning for Time Series
    1. Deep Learning Concepts
    2. Programming a Neural Network
      1. Data, Symbols, Operations, Layers, and Graphs
    3. Building a Training Pipeline
      1. Inspecting Our Data Set
      2. Steps of a Training Pipeline
    4. Feed Forward Networks
      1. A Simple Example
      2. Using an Attention Mechanism to Make Feed Forward Networks More Time-Aware
    5. CNNs
      1. A Simple Convolutional Model
      2. Alternative Convolutional Models
    6. RNNs
      1. Continuing Our Electric Example
      2. The Autoencoder Innovation
    7. Combination Architectures
    8. Summing Up
    9. More Resources
  12. 11. Measuring Error
    1. The Basics: How to Test Forecasts
      1. Model-Specific Considerations for Backtesting
    2. When Is Your Forecast Good Enough?
    3. Estimating Uncertainty in Your Model with a Simulation
    4. Predicting Multiple Steps Ahead
      1. Fit Directly to the Horizon of Interest
      2. Recursive Approach to Distant Temporal Horizons
      3. Multitask Learning Applied to Time Series
    5. Model Validation Gotchas
    6. More Resources
  13. 12. Performance Considerations in Fitting and Serving Time Series Models
    1. Working with Tools Built for More General Use Cases
      1. Models Built for Cross-Sectional Data Don’t “Share” Data Across Samples
      2. Models That Don’t Precompute Create Unnecessary Lag Between Measuring Data and Making a Forecast
    2. Data Storage Formats: Pluses and Minuses
      1. Store Your Data in a Binary Format
      2. Preprocess Your Data in a Way That Allows You to “Slide” Over It
    3. Modifying Your Analysis to Suit Performance Considerations
      1. Using All Your Data Is Not Necessarily Better
      2. Complicated Models Don’t Always Do Better Enough
      3. A Brief Mention of Alternative High-Performance Tools
    4. More Resources
  14. 13. Healthcare Applications
    1. Predicting the Flu
      1. A Case Study of Flu in One Metropolitan Area
      2. What Is State of the Art in Flu Forecasting?
    2. Predicting Blood Glucose Levels
      1. Data Cleaning and Exploration
      2. Generating Features
      3. Fitting a Model
    3. More Resources
  15. 14. Financial Applications
    1. Obtaining and Exploring Financial Data
    2. Preprocessing Financial Data for Deep Learning
      1. Adding Quantities of Interest to Our Raw Values
      2. Scaling Quantities of Interest Without a Lookahead
      3. Formatting Our Data for a Neural Network
    3. Building and Training an RNN
    4. More Resources
  16. 15. Time Series for Government
    1. Obtaining Governmental Data
    2. Exploring Big Time Series Data
      1. Upsample and Aggregate the Data as We Iterate Through It
      2. Sort the Data
    3. Online Statistical Analysis of Time Series Data
      1. Remaining Questions
      2. Further Improvements
    4. More Resources
  17. 16. Time Series Packages
    1. Forecasting at Scale
      1. Google’s Industrial In-house Forecasting
      2. Facebook’s Open Source Prophet Package
    2. Anomaly Detection
      1. Twitter’s Open Source AnomalyDetection Package
    3. Other Time Series Packages
    4. More Resources
  18. 17. Forecasts About Forecasting
    1. Forecasting as a Service
    2. Deep Learning Enhances Probabilistic Possibilities
    3. Increasing Importance of Machine Learning Rather Than Statistics
    4. Increasing Combination of Statistical and Machine Learning Methodologies
    5. More Forecasts for Everyday Life
  19. Index