Chapter 39. Scaffolding Your Data

Scaffolding is a term you may not come across often, but the problems it solves are increasingly common in modern data work.

What Is Scaffolding?

Scaffolding is the process of filling in missing rows within a data set so that it supports the analysis you want to conduct. A data set may appear complete, with no nulls and a record for each individual entity, but still not be suited to that analysis. Consider a mobile phone operator that wishes to analyze its monthly revenue from contracted customers. Figure 39-1 shows the data the operator is likely to have.

Figure 39-1. Data set requiring scaffolding to assist analysis

As you can see, the operator has a record of the customer, the contract start date, the contract length, and the monthly price for the service. However, there is no date for each month in which revenue is collected, so we can’t determine the value we’re seeking to analyze: monthly revenue. The only dates available for analysis are the contract start date and end date. If the contract is for two years, we would need 24 records, one for each month, to gain full insight. As Figure 39-2 demonstrates, there is only a single month of revenue per Customer ID, and we can’t see the revenue being collected over time.

Figure 39-2. Resulting visualization from data set in Figure 39-1
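
To make the idea of one row per month concrete, here is a minimal sketch in plain Python rather than in Tableau Prep. It is an illustration under stated assumptions, not the method this chapter goes on to describe: the field names (customer_id, start_date, length_months, monthly_price), the helper functions, and the sample contracts are all hypothetical, and revenue is assumed to be recognized in full for every calendar month from the start month onward.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Contract:
    # Hypothetical field names standing in for the columns described above.
    customer_id: int
    start_date: date
    length_months: int
    monthly_price: float


def add_months(d: date, months: int) -> date:
    """Return the first day of the month that is `months` after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def scaffold(contracts):
    """Expand each contract into one row per month of the contract."""
    rows = []
    for c in contracts:
        for offset in range(c.length_months):
            rows.append({
                "customer_id": c.customer_id,
                "revenue_month": add_months(c.start_date, offset),
                "revenue": c.monthly_price,
            })
    return rows


# Made-up sample data; these values are not taken from Figure 39-1.
contracts = [
    Contract(1, date(2020, 1, 15), 24, 30.0),
    Contract(2, date(2020, 11, 1), 12, 45.0),
]
rows = scaffold(contracts)
```

With these two sample contracts, the single record for the two-year contract becomes 24 rows and the one-year contract becomes 12, so revenue can be summed by revenue_month and viewed over time.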

Scaffolding is the answer to ...
