3 DATA VISUALIZATION

In this chapter, we describe a set of plots that can be used to explore the multi‐dimensional nature of a dataset. We present basic plots (bar charts, line graphs, and scatter plots), distribution plots (boxplots and histograms), and different enhancements that expand the capabilities of these plots to visualize more information. We focus on how the different visualizations and operations can support machine learning tasks, from supervised (prediction, classification, and time series forecasting) to unsupervised tasks, and provide some guidelines on specific visualizations to use with each machine learning task. We also describe the advantages of interactive visualization over static plots. The chapter concludes with a presentation of specialized plots that are suitable for data with special structure (hierarchical and geographical).

Data visualization in JMP: All the methods discussed in this chapter are available in the standard version of JMP.

3.1 INTRODUCTION

The popular saying “a picture is worth a thousand words” refers to the ability to condense diffused verbal information into a compact and quickly understood graphical image. In the case of numbers, data visualization and numerical summarization provide us with both a powerful tool to explore data and an effective way to present results (Few, 2012).

Where do visualization techniques fit into the machine learning process as described so far? They are primarily used in the preprocessing portion of ...
