Live online training

Introduction to Interactive Data Visualization with Python

A hands-on guide to the Plotly library

Noureddin Sadawi

Data visualization is a powerful tool for communicating information; as the saying goes, “a picture is worth a thousand words”. Today, huge amounts of complex data are produced in areas such as health, business, engineering and science, and human activity on social media is growing at an astonishing rate. Appropriate data visualization is therefore an important way not only to provide visual summaries and interpretation, but also to improve understanding, decision making and communication.

Python is easy to learn, even for people with no previous programming experience (it is a great first language). One of its strengths is its range of data visualization and plotting libraries. As Stuart Card said: “Visualization is really about external cognition, that is, how resources outside the mind can be used to boost the cognitive capabilities of the mind.”

Plots can be static (the viewer cannot interact with them by zooming in or out, selecting certain areas and so on) or interactive (the viewer can click, zoom in and out, and inspect specific points, areas and values). Interactive plots are powerful: not only can the viewer interact with them, but they can also be easily embedded in HTML pages.

By learning how to generate interactive plots in Python, you will gain the confidence to write your own code and produce exactly the plot you intend. This means you no longer have to rely on existing platforms that offer only a limited number of ways to customize plots and visualizations.

This course is an introduction to Plotly, an open source and freely available interactive data visualization package. It shows how to generate and customize several different plot types. You will receive professional code developed by the trainer.

What you'll learn and how you can apply it

  • Understand the importance, power and impact of data visualization
  • Learn how to select the right plot type for your purposes
  • Learn how to produce various types of interactive plots in Python
  • Learn how to embed Python-generated interactive plots into HTML pages so you can publish them online
  • Learn how to customize interactive plots so you no longer rely on existing platforms

This training course is for you because...

  • As a professional, you wish to make an impact on others by generating well-designed interactive data visualizations
  • You would like to learn how to communicate a piece of information (or an idea) to others via a plot that has a friendly look and feel
  • You would like to be proficient at customizing Python interactive plots and be able to make effective visualizations
  • You want to strengthen your Python skills, especially when it comes to data handling and visualization
  • You want to learn how to simplify complex data by helping your target users to see and feel it rather than read it

Prerequisites

  • Familiarity with Python: you should be able to launch and use Python in your favourite IDE (Spyder is recommended) and in Jupyter (you know how to create and run Jupyter notebooks)

Course Set-up

  • Any operating system is fine
  • Python 3.5 or above

About your instructor

  • Dr. Noureddin Sadawi is a consultant in machine learning and data science. He has several years’ experience in various areas involving data manipulation and analysis. He received his PhD from the University of Birmingham, United Kingdom. During his PhD he developed a technique to extract precise information from bitmap images of chemical structure diagrams. He developed a tool called MolRec and used it to participate in evaluation contests at two international events, TREC2011 and CLEF2012, and won both of them.

    Noureddin is an avid scientific software researcher and developer who has a passion for learning and teaching new technologies. He has been involved in several projects spanning a variety of fields such as bioinformatics, drug discovery, omics data analysis and much more. He has taught at multiple universities in the UK and has worked as a software engineer in different roles. One of his latest positions was as a research associate at the highly respected Imperial College London, where he contributed significantly to the PhenoMeNal project (a project that heavily uses Docker). Currently, he is a research fellow at the Department of Computer Science, Brunel University – London, where he has developed deep learning techniques for the analysis of human gesture data.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Part 1: An Introduction, Overview and Installation of Plotly (60 minutes)

  • Install the Plotly library and test installation
  • An overview of why interactive visualization is powerful
  • An overview of the available plot types in Plotly and where to find more information
  • Line and area plots in Plotly (includes plotting multiple lines and plot configuration)

Q&A (15 minutes)

Break (10 minutes)

Part 2: Styling Plots, Scatter Plots, Bar, Pie, Bubble and Gantt Charts (60 minutes)

  • More on styling and customizing plots
  • Scatter plots in Plotly (includes mouse hover control and adding color dimension)
  • Bar charts in Plotly (includes grouped and stacked bar charts, mouse hover and more)
  • Pie charts in Plotly (includes controlling colors, adding custom labels, donut charts and pulling sectors out from the center)
  • Bubble charts (includes setting marker size and color and more)
  • Gantt charts (includes how to group tasks together and more)

Q&A (15 minutes)

Break (10 minutes)

Part 3: Statistical Plots (60 minutes)

  • Error bars (symmetric and asymmetric error bars)
  • Box plots (includes box plot styling, mean and standard deviation)
  • Scatter matrix (includes how to style a scatter matrix)
  • Histograms and 2D histograms (includes setting the number of bins, and multiple and overlaid histograms)
  • Distribution plots (plotting distributions and curves of single or multiple variables)
  • Violin plots (includes combining violin plots with box plots and original data points, also grouped and split violin plots)

Q&A (15 minutes)

Break (10 minutes)