The regex validation and cleansing design pattern

This design pattern deals with using the regex functions to validate the data. The regex functions can be used to validate the data to match a specific length or pattern and to cleanse the invalid data.


This design pattern discusses ways to use a regex function to identify and clean data that has invalid field lengths. The pattern also identifies all the occurrences of values with the specified date format from within the data and removes the invalid values that do not comply with the format specified.


Identifying string data with incorrect length is one of the quickest ways to understand if the data is accurate. Often we will need this string length parameter to judge the data ...

Get Pig Design Patterns now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.