Summary

This chapter has introduced recipes for advanced data operations. We have looked at multi-valued cells in different ways: when the values were of equal importance, we split them across several rows; when the values served different functions, we split them across columns. We have also seen that OpenRefine has a special mode, called records mode, for working with multi-valued cells spread over different rows. In records mode, multiple rows that belong to the same object can be treated as one, giving you powerful search and manipulation options.
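As a reminder of the difference between the two kinds of split, here is a minimal Python sketch. It does not use OpenRefine and is not how OpenRefine implements these operations. The function names, the `|` separator, and the sample data are all assumptions made up for illustration. As a simplification, the row-wise split copies the other fields into every new row.

```python
def split_into_rows(rows, column, separator="|"):
    """Give every value of a multi-valued cell its own row.

    Simplification: the other fields are copied into each new row.
    """
    result = []
    for row in rows:
        for value in row[column].split(separator):
            new_row = dict(row)            # copy so the input is not modified
            new_row[column] = value.strip()
            result.append(new_row)
    return result


def split_into_columns(rows, column, separator="|"):
    """Spread the values of a multi-valued cell over numbered columns."""
    result = []
    for row in rows:
        # Keep every field except the one being split.
        new_row = {key: val for key, val in row.items() if key != column}
        values = row[column].split(separator)
        for index, value in enumerate(values, start=1):
            new_row[f"{column} {index}"] = value.strip()
        result.append(new_row)
    return result


# Hypothetical sample data, not taken from the book's dataset.
sample = [{"id": "1", "category": "coins | medals"}]
as_rows = split_into_rows(sample, "category")
as_columns = split_into_columns(sample, "category")
```

With this sample, `as_rows` holds two rows that share the same `id`, while `as_columns` holds a single row with the fields `category 1` and `category 2`.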

We also introduced you to clustering, which helps when cell values that should be consistent are in fact spelled or formatted in slightly different ways. You can even go further and define your own transformation operations ...
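The idea behind clustering can be sketched in a few lines of Python. The sketch below is loosely modeled on a key-collision approach: each value is reduced to a normalized key, and values that share a key are grouped together. It is a simplification under stated assumptions, not OpenRefine's actual implementation. The function names `fingerprint` and `cluster`, the exact normalization steps, and the sample values are all hypothetical.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a value to a normalized key (assumed steps, for illustration)."""
    text = value.strip().lower()
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Sort and de-duplicate the words so that word order does not matter.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values whose keys collide; keep groups with differing spellings."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Hypothetical messy values.
messy = ["New York", "new york ", "York, New", "Boston"]
clusters = cluster(messy)
```

Here the three spellings of New York end up in one cluster, and "Boston" is left alone because nothing else shares its key. A human still has to decide which spelling each cluster should be merged into.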
