Chapter 4. Shaping Data for Use with Tableau
The next thing I wish I knew when I first started using Tableau is that there is an optimal way to shape data for use with the software. I’ll never forget the day I was introduced to Tableau. The boss walked in and asked three of us in the office to try out this new tool she had heard of for creating data visualizations. The first thing every one of us tried to do is connect to an existing Excel report and re-create it in Tableau. After all, this was supposed to be intuitive—perhaps even magical—software, right? We quickly found out that nothing worked as we expected, we couldn’t figure out how to make a single chart, and we had to fight the temptation to immediately revert back to our familiar Excel experience.
It’s fun to look back, and this now seems like a simple problem to solve, but the scenario I experienced my first time with Tableau is not uncommon. In fact, it’s both the most common Tableau adoption scenario that I come across—and the most difficult: first-time users connecting to an existing Excel report without any consideration to the format of the data.
Most existing Excel reports are not set up to work well with Tableau, and if this is the first data source that a first-time user attempts to work with, they are setting themselves up to fail. But don’t despair—I can personally attest to what it’s like to start using Tableau without any data or visualization software experience. I’ve always said that there are three or four key things to know when getting started with Tableau, and sometimes you just need somebody to share them with you so you can connect the dots and get started.
Shaping Data for Use with Tableau
This topic is easiest to illustrate, so take a look at the first image, which is meant to be similar to a typical report in Excel:
There is a title along the top, a column header for each quarter, and a row for each KPI (Sales, Profit, and Orders). In addition, there is a total for each row on the righthand side of the table.
The format of this report poses several problems for Tableau which, upon connecting, will try to interpret the data source, classify the fields, and set up your workspace:
-
There is a title in the first row. The first two rows are critical for Tableau to interpret the data source, so we’ve immediately gotten off on the wrong foot.
-
The column headers are quarters, which will cause Tableau to create a field for each quarter, when in fact the quarters should all be consolidated into one field for date/quarter.
-
The KPIs are running down the first column so, by default, Tableau will not interpret these KPIs as unique fields.
-
There is a total in the right column. As Tableau totals fields for you, not only is this unnecessary, it will likely lead to double-counting.
The ideal format for Tableau looks like this:
Each column now represents a unique field, so the layout is vertical instead of horizontal. The title and totals have also been removed.
With the data in this shape, Tableau will be able to look at the first row to determine the fields and the second row to classify the data (i.e., type; discrete versus continuous; dimension versus measure). We will discuss the ways Tableau classifies data in the next few chapters.
As one additional tip, if your dataset includes a date field that is not in a traditional date format (as we’ve shown here with quarters), I recommend adding a column that looks like an actual date. In this case, I’ve added a column for quarter as date, and chosen the first date in each quarter as the entries:
Note
Note that I’ve put the dates in my local (i.e., US) format, but this tip is also true for Tableau users outside of the United States using varying date formats. Your local version of Tableau should recognize the date format that you’re used to.
Dates are a special data type in Tableau and by having dates in a date format that the software recognizes, the full functionality of date fields is unlocked.
Finally, if data reshaping is required for you to work with a dataset in Tableau, you can reshape it prior to connecting—which is my personal preference—or use Tableau’s data interpreter and data pivot tools when you connect. Regardless of the method you choose, putting some thought into the shape of your data will help you get off to a strong start with your analyses in Tableau.
Get Practical Tableau now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.