Live Online Training

Beginning Time Series Analysis and Forecasting

Topic: Data
Jeffrey Yau

Time series forecasting is both a fascinating subject to study and an important technique applied in industry, government, and academic settings. Example applications include demand and inventory planning, marketing strategy planning, capital budgeting, pricing, machine predictive maintenance, and macroeconomic forecasting, to name just a few. Forecasting typically requires time series data, which is now ubiquitous both within and outside the data science field. Examples include weekly initial unemployment claims, tick-level stock prices, weekly company sales, daily step counts recorded by a wearable, machine performance measurements recorded by sensors, and key performance indicators of business functions.

This training provides an introduction to time series analysis and forecasting. It covers the key differences between time series data and cross-sectional data, manipulation of time series data, exploratory time series data analysis using statistics (and their graphical representations), and one of the most important classes of statistical time series models: Autoregressive Integrated Moving Average (ARIMA) models and their seasonal counterpart (SARIMA), with and without explanatory variables. Because some of the most important and commonly used data science techniques for analyzing time series data and forecasting from it were developed in the fields of statistics and machine learning, this introductory training provides the practical foundations for conducting time series analysis and forecasting.

What you'll learn and how you can apply it

  • The key characteristics that distinguish time series data from non-time series data
  • Statistics for summarizing time series
  • Graphical techniques to describe characteristics of time series
  • Common use cases of the class of ARIMA models
  • Essential concepts required to appropriately apply the class of ARIMA models in practice, such as

    • Mathematical formulation of this class of model
    • Statistical assumptions of this class of model
    • Implementation of these models in Python using simulated and real-world time-series data
    • ARIMA model selection
    • Assumption testing and model evaluation
    • Forecasting
  • The advantages and disadvantages of the class of ARIMA models

This training course is for you because...

Time series data is ubiquitous, and it differs from cross-sectional data in that it has temporal dependence, which can be leveraged to forecast future values of the series. However, analyzing these characteristics and modeling them for forecasting requires a different set of techniques. This course teaches these techniques and is designed for data scientists who are at the beginning of their journey of analyzing time series data and producing time series forecasts. The concepts and techniques discussed in this course form part of the foundation for learning more advanced time series methods.
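To make the idea of temporal dependence concrete, here is a minimal sketch using only the Python standard library. It is not taken from the course materials: the AR(1) coefficient of 0.8, the series length, the seed, and the helper name `lag1_autocorr` are all assumptions chosen for illustration. It compares the lag-1 autocorrelation of a simulated series with that of the same values in shuffled order, where the time ordering carries no information.

```python
import random


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation: covariance of x_t with x_{t-1} over the variance."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simulate y_t = phi * y_{t-1} + e_t (phi = 0.8 is an illustrative assumption).
phi = 0.8
series = [0.0]
for _ in range(1999):
    series.append(phi * series[-1] + rng.gauss(0, 1))

# Shuffling keeps the same values but destroys their ordering in time.
shuffled = series[:]
rng.shuffle(shuffled)

r_series = lag1_autocorr(series)      # expected to be close to phi
r_shuffled = lag1_autocorr(shuffled)  # expected to be close to zero

print(f"lag-1 autocorrelation, original order: {r_series:.2f}")
print(f"lag-1 autocorrelation, shuffled order: {r_shuffled:.2f}")
```

The ordered series shows strong correlation between neighboring observations, while the shuffled copy behaves like cross-sectional data. That ordering-dependent structure is what forecasting models try to exploit.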

Prerequisites

  • Working knowledge of Python and R
  • Jupyter Notebook or JupyterLab
  • Working knowledge of the classical linear regression model

Course Set-up

The course slides, datasets, and Jupyter notebooks will be posted in this repo: https://github.com/jeffrey-yau/Pearson-TSA-Training-Beginner.git

Attendees should have Jupyter Notebook or JupyterLab installed, ideally through Anaconda. Anaconda’s distribution of Python comes with Jupyter Notebook and hundreds of libraries useful for statistical and machine learning modeling and data analysis.

Anaconda: https://www.anaconda.com/products/individual

Package list: https://docs.anaconda.com/anaconda/packages/pkg-docs/

About your instructor

  • Jeffrey is the Chief Data Scientist at AllianceBernstein, a global investment firm managing over $500 billion. In this role, he is responsible for leading the data science team, partnering with investment professionals to create investment signals, and collaborating with sales and marketing teams to optimize sales. He holds a Ph.D. in economics from the University of Pennsylvania and has taught statistics, econometrics, and machine learning courses at UC Berkeley, Cornell, NYU, the University of Pennsylvania, and Virginia Tech. Previously, Jeffrey held data science and analytics leadership positions at Silicon Valley Data Science, Charles Schwab Corporation, and KPMG. Jeffrey is also a frequent speaker at leading data science conferences such as ODSC, Spark & AI Summit, and Strata.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Segment 1: Introduction to time series analysis (40 min)

  • 1.1 Introduction and welcome to the course
  • 1.2 Common use cases of time series analysis from different disciplines
  • 1.3 Common characteristics and patterns of time series
  • 1.4 The class of models to be covered today: A demo
  • Exercise 1
  • Break (10 min)

Segment 2: Exploratory Time Series Data Analysis and ARMA Model Formulation (60 min)

  • 2.1 A brief discussion on the notion of stochastic processes, time series, stationarity, and basic terminology of time series analysis
  • 2.2 Exploratory Time Series Data Analysis
  • 2.3 Mathematical formulation of AR, MA, and ARMA models
  • 2.4 Lag (or backshift) operators
  • 2.5 Properties of the general Autoregressive model of order p (AR(p))
  • 2.6 Properties of the general Moving Average model of order q (MA(q))
  • Exercise 2
  • Break (10 min)
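For orientation, the models named in Segment 2 are commonly written as follows in textbooks. This is a sketch in standard notation, not the course's own: the symbols $c$, $\mu$, $\phi_i$, $\theta_j$, the white-noise term $\varepsilon_t$, and the lag operator $L$ are assumptions, and the slides may use different letters or sign conventions.

```latex
% AR(p): the current value depends on its own p most recent values
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t

% MA(q): the current value depends on the q most recent shocks
y_t = \mu + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}

% ARMA(p, q): both components combined
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}

% Lag (backshift) operator: L y_t = y_{t-1}, so L^k y_t = y_{t-k}
\phi(L) \, y_t = c + \theta(L) \, \varepsilon_t,
\qquad
\phi(L) = 1 - \sum_{i=1}^{p} \phi_i L^{i},
\qquad
\theta(L) = 1 + \sum_{j=1}^{q} \theta_j L^{j}
```

The last line restates the ARMA(p, q) equation compactly: moving the autoregressive terms to the left-hand side and collecting powers of $L$ gives the two lag polynomials.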

Segment 3: Exploratory Time Series Data Analysis (ETSDA) and ARIMA Model Formulation (60 min)

  • 3.1 Notion of non-stationarity
  • 3.2 Mathematical formulation of ARIMA models
  • 3.3 The Box-Jenkins Approach to ARIMA Modeling of non-stationary time series
  • Exercise 3
  • Break (10 min)
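The "integrated" part of ARIMA refers to differencing a non-stationary series until it looks stationary. The sketch below illustrates that step with the Python standard library only. The helper `difference`, the toy trend series, and the simulated random walk are hypothetical examples and are not drawn from the course notebooks.

```python
import random


def difference(series, d=1):
    """Apply first differencing d times; each pass maps y_t to y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


# A linear trend becomes constant after one difference (d = 1).
linear = [3 * t + 1 for t in range(6)]
linear_diff = difference(linear, d=1)        # [3, 3, 3, 3, 3]

# A quadratic trend needs two differences (d = 2) to become constant.
quadratic = [t * t for t in range(6)]
quadratic_diff = difference(quadratic, d=2)  # [2, 2, 2, 2]

# A random walk y_t = y_{t-1} + e_t is non-stationary, but its first
# difference recovers the underlying shocks e_t.
rng = random.Random(1)
shocks = [rng.gauss(0, 1) for _ in range(500)]
walk = [0.0]
for e in shocks:
    walk.append(walk[-1] + e)
recovered = difference(walk, d=1)
max_error = max(abs(r - e) for r, e in zip(recovered, shocks))

print(linear_diff, quadratic_diff)
print(f"largest gap between differenced walk and shocks: {max_error:.1e}")
```

Each pass of differencing shortens the series by one observation, which is why the outputs are shorter than the inputs.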

Segment 4: ARIMA Modeling (60 min)

  • 4.1 Model Identification
  • 4.2 Model Diagnostic Checking
  • 4.3 Model performance evaluation (in-sample fit)
  • 4.4 Forecasting and forecast evaluation
  • 4.5 Incorporating explanatory variables: use cases and practical suggestions
  • Exercise 4
  • Break (10 min)
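As a rough illustration of the fit, forecast, and evaluate workflow listed in Segment 4, the following sketch uses only the Python standard library rather than a modeling package. Everything in it is an assumption made for illustration: the simulated AR(1) data, the 1600/400 train/holdout split, the least-squares estimator without an intercept, and the choice of RMSE against a constant-mean benchmark. A real analysis would normally rely on a time series library and the diagnostics covered in class.

```python
import math
import random

rng = random.Random(42)  # fixed seed for reproducibility

# Simulate a zero-mean AR(1) process (true_phi = 0.7 is an illustrative choice).
true_phi = 0.7
y = [0.0]
for _ in range(1999):
    y.append(true_phi * y[-1] + rng.gauss(0, 1))

# Hold out the last 400 observations for forecast evaluation.
train, test = y[:1600], y[1600:]

# Least-squares estimate of phi from regressing y_t on y_{t-1} (no intercept).
num = sum(train[t] * train[t - 1] for t in range(1, len(train)))
den = sum(train[t - 1] ** 2 for t in range(1, len(train)))
phi_hat = num / den

# One-step-ahead forecasts: each uses the actual previous observation.
previous = [train[-1]] + test[:-1]
ar_forecasts = [phi_hat * p for p in previous]

# Benchmark: always predict the training-sample mean.
train_mean = sum(train) / len(train)
baseline_forecasts = [train_mean] * len(test)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


rmse_ar = rmse(test, ar_forecasts)
rmse_baseline = rmse(test, baseline_forecasts)

print(f"estimated phi: {phi_hat:.3f}")
print(f"holdout RMSE, AR(1): {rmse_ar:.3f}; constant mean: {rmse_baseline:.3f}")
```

Comparing against a simple benchmark on data the model has not seen is the point of the exercise: a lower holdout RMSE suggests the model is capturing temporal structure rather than just fitting the training sample.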

Segment 5: Seasonal ARIMA Modeling (60 min)

  • 5.1 Understanding seasonality and examination of seasonal time series
  • 5.2 Mathematical formulation of Seasonal ARIMA (SARIMA) models
  • 5.3 Building a seasonal ARIMA model for forecasting
  • Exercise 5
  • Break (10 min)
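For reference, a seasonal ARIMA model is often written in the multiplicative form below. This is a sketch in common textbook notation, not necessarily the course's: the orders $(p, d, q)$ and $(P, D, Q)$, the seasonal period $s$, the lag operator $L$ (with $L y_t = y_{t-1}$), and the coefficient symbols are all assumptions.

```latex
% SARIMA(p, d, q)(P, D, Q)_s in lag-operator form
\phi(L) \, \Phi(L^{s}) \, (1 - L)^{d} \, (1 - L^{s})^{D} \, y_t
  = c + \theta(L) \, \Theta(L^{s}) \, \varepsilon_t

% Non-seasonal lag polynomials
\phi(L) = 1 - \sum_{i=1}^{p} \phi_i L^{i},
\qquad
\theta(L) = 1 + \sum_{j=1}^{q} \theta_j L^{j}

% Seasonal lag polynomials, in powers of L^s
\Phi(L^{s}) = 1 - \sum_{i=1}^{P} \Phi_i L^{i s},
\qquad
\Theta(L^{s}) = 1 + \sum_{j=1}^{Q} \Theta_j L^{j s}
```

Here $(1 - L)^{d}$ is ordinary differencing and $(1 - L^{s})^{D}$ is seasonal differencing. With monthly data and $s = 12$, for example, $(1 - L^{12}) y_t = y_t - y_{t-12}$ compares each month with the same month one year earlier.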

Segment 6: Closing Remarks: Practical suggestions and other topics (20 min)

  • 6.1 Model selection heuristics
  • 6.2 Course wrap-up and next steps