Chapter 25. Splitting Data Fields

One of the most common actions you will take when preparing string data for analysis is to split a string field into its subparts. Splitting was briefly mentioned in Chapter 9, which covers the basics of working with strings, if you need a refresher on the data type. Splitting is required for many reasons: operational systems may pick up data and output a unique ID for each record, or squeeze several values together to fit them into a specified database table. The human brain is fantastic at spotting patterns in data (that’s why we create visual analytics, after all), so you can often spot the need to split data fields (columns) just by looking at the data set.

For example, Figure 25-1 suggests that the Product Code field on the left should be split into three separate columns (on the right) to help us analyze this data set.

Figure 25-1. Result of splitting Product Code field
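Outside Prep Builder, the same idea can be sketched in a few lines of Python. This is only an illustration: the product codes, the hyphen separator, and the `split_code` helper are assumptions made for the example, not the values shown in Figure 25-1.

```python
# Hypothetical product codes; the real values in Figure 25-1 may differ.
product_codes = ["TS-0042-BLU", "JK-0107-RED", "HT-0003-GRN"]


def split_code(code, sep="-", parts=3):
    """Split one code into exactly `parts` pieces, padding with None if short."""
    # maxsplit = parts - 1 keeps any extra separators inside the last piece.
    pieces = code.split(sep, parts - 1)
    return pieces + [None] * (parts - len(pieces))


# One list of three values per record, ready to load into three columns.
rows = [split_code(code) for code in product_codes]

for code, row in zip(product_codes, rows):
    print(code, "->", row)
```

Padding short values with `None` keeps every record at three columns, so a malformed code shows up as a null rather than shifting the other fields.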

Basic Splits

Splitting data in most data tools is very easy; Prep Builder is no different. Simply choose the data field you want to split, click the ellipsis in the top right of the field, select Split Values, and then click Automatic Split. Prep Builder will split up the field using what it believes to be the most appropriate logic (Figure 25-2).

Figure 25-2. Selecting Automatic Split from a data field’s menu
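To get a feel for what an automatic split has to decide, here is a rough Python sketch of one possible heuristic: pick the non-alphanumeric character that appears in every value and split on it. This is an assumption for illustration only and is not a description of Prep Builder's actual logic; `guess_separator` and `automatic_split` are hypothetical names.

```python
from collections import Counter


def guess_separator(values):
    """Return a non-alphanumeric character found in every value, or None."""
    counts = Counter()
    for value in values:
        # Count each candidate character once per value, not once per use.
        counts.update({ch for ch in value if not ch.isalnum()})
    if not counts:
        return None
    sep, hits = counts.most_common(1)[0]
    # Only trust a separator that is present in every value.
    return sep if hits == len(values) else None


def automatic_split(values):
    """Split every value on the guessed separator; leave values whole if none."""
    sep = guess_separator(values)
    return [value.split(sep) if sep else [value] for value in values]


# Hypothetical sample data.
print(automatic_split(["TS-0042-BLU", "JK-0107-RED"]))
```

When no single character appears in every value, the sketch leaves the field unsplit rather than guessing.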

In this case, the ...
