The pattern-matching pattern
This section describes the pattern-matching design pattern in which we use Pig scripts to match numeric and text patterns, to ascertain if the data is coherently relative to itself and thus, get a measure of data quality.
In the enterprise context, examining the data for coherence comes after the data has been ingested and its completeness and correctness has been ascertained. The values of a given attribute can come in different shapes and sizes. This is especially true for fields requiring human input, where the values are entered according to the whims of the user. Assuming a column representing the phone number field is coherent, it can be said that all the values represent valid phone numbers since they ...