Recipe 4 – transforming cell values

In Chapter 2, Analyzing and Fixing Data, we saw that OpenRefine can automatically change the contents of all cells in a column, for example by trimming whitespace. In the previous recipe, we learned that clustering is another way to change values across a whole column. Both operations are instances of a more general mechanism for transforming cell contents, which lets you change the value of each cell in various complex ways. Although this looks a bit like Excel formulas, it is surprising how much can be done with only a short expression.

For instance, suppose you don't like the vertical bar as a separator in the Categories field and want to have a comma followed by a space instead. While this could be solved by first splitting ...
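To make the idea of a per-cell transformation concrete, here is a minimal sketch in Python rather than in OpenRefine itself. It applies one small function to every value in a column, replacing the vertical bar with a comma followed by a space. The sample values and the helper names `transform_column` and `bar_to_comma` are hypothetical and only serve as illustration. In OpenRefine, the equivalent would likely be a short GREL expression along the lines of `value.replace("|", ", ")`.

```python
def transform_column(cells, transform):
    """Apply `transform` to every non-empty cell; empty cells (None) stay empty."""
    return [transform(value) if value is not None else None for value in cells]


def bar_to_comma(value):
    """Replace each vertical bar separator with a comma followed by a space."""
    return value.replace("|", ", ")


# Hypothetical sample values for a Categories column (not taken from the dataset).
categories = ["Numismatics|Coins", "Textiles", None]

result = transform_column(categories, bar_to_comma)
print(result)
```

The point of the sketch is that the transformation is written once, for a single value, and then applied uniformly to the whole column. This is the same pattern behind trimming whitespace or any other column-wide change.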
