Chapter 5. Cleaning Your Data

Who made this mess? Let’s clean it up! Cleaning your data should not be as hard as cleaning the room of a 5-year-old who loves paint and LEGO bricks (trust me, it’s not as easy as you think). Those of us who have spent enough time working with data know it can get messy, dirty, and downright unusable. Data is like a lot of things in life: it’s fragile and requires a lot of care and attention. I believe providing that care and attention shouldn’t be hard, primarily because Designer makes it so simple and straightforward.

Before we get started, please remember that each chapter in this book is additive. You will build on your knowledge of tools and techniques as you go through the book. You will learn things in this chapter that will help you in nearly every other chapter going forward. If you get to a new chapter and feel a bit overwhelmed, feel free to roll back and reread, think, and practice more. The goal, as I’ve stated, is to turn you into an absolute, bona fide Alteryx rock star. Let’s do this.

When we talk about having “clean data,” what do we mean, exactly? We mean removing all the quirks in our data that don’t help us tell the story we are trying to tell. Usually, when I refer to “clean data,” I am referring to five specific factors:

  • We don’t have null values.

  • We don’t have missing values.

  • We don’t have duplicative data.

  • We don’t have incorrect or “bad” values (bad is subjective, of course). ...

