Dirty data is data that suffers from problems such as erroneous or conflicting entries, missing values, and outdated records. Tidy data is the opposite: data in a consistent format, free of inconsistencies and similar issues.
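To make these categories concrete, here is a minimal sketch that flags each kind of problem in a handful of records. Everything in it is an assumption made for illustration: the `records` list, the field names, the 0 to 120 age range used to spot erroneous entries, and the `CUTOFF` date used to decide what counts as outdated. Real checks depend entirely on your own data.

```python
from datetime import date

# Hypothetical customer records; all names and values are invented for illustration.
records = [
    {"id": 1, "email": "ann@example.com", "age": 34, "updated": date(2024, 3, 1)},
    {"id": 2, "email": None, "age": 29, "updated": date(2024, 2, 11)},               # missing value
    {"id": 3, "email": "bob@example.com", "age": -5, "updated": date(2024, 1, 20)},  # erroneous entry
    {"id": 4, "email": "cy@example.com", "age": 41, "updated": date(2015, 6, 30)},   # outdated
    {"id": 1, "email": "ann@example.org", "age": 34, "updated": date(2024, 3, 2)},   # conflicts with the first id 1
    {"id": 5, "email": "dee@example.com", "age": 52, "updated": date(2024, 4, 9)},   # no issues
]

CUTOFF = date(2020, 1, 1)  # assumed threshold: anything older counts as outdated


def find_issues(rows, cutoff=CUTOFF):
    """Return a dict mapping each row index to a list of issue labels."""
    issues = {i: [] for i in range(len(rows))}
    first_seen = {}  # id -> index of the first row carrying that id
    for i, row in enumerate(rows):
        # Missing: any field without a value.
        if any(value is None for value in row.values()):
            issues[i].append("missing")
        # Erroneous: an age outside an assumed plausible range.
        if row["age"] is not None and not 0 <= row["age"] <= 120:
            issues[i].append("erroneous")
        # Outdated: last update before the assumed cutoff.
        if row["updated"] is not None and row["updated"] < cutoff:
            issues[i].append("outdated")
        # Conflicting: the same id appears again with a different email.
        if row["id"] in first_seen:
            j = first_seen[row["id"]]
            if rows[j]["email"] != row["email"]:
                issues[i].append("conflicting")
                if "conflicting" not in issues[j]:
                    issues[j].append("conflicting")
        else:
            first_seen[row["id"]] = i
    return issues


issues = find_issues(records)
clean = [row for i, row in enumerate(records) if not issues[i]]

for i, labels in issues.items():
    print(i, labels or "ok")
print(f"{len(clean)} of {len(records)} records have no detected issues")
```

Only the record with `id` 5 passes every check here. Note that both copies of `id` 1 are flagged as conflicting, since nothing in the data says which email is the correct one.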
Dirty data causes several problems. First, it makes consolidating different data sources difficult or sometimes outright impossible. Second, many of the data points might not be usable, which reduces the effective size of your data. You might be holding 5 GB of data, but only ...