Validating and Transforming Data
Problem
You need to make sure the data values contained in a file are legal.
Solution
Check them, possibly rewriting them into a more suitable format.
Discussion
Earlier recipes in this chapter show how to work with the structural characteristics of files, by reading lines and busting them up into separate columns. It’s important to be able to do that, but sometimes you need to work with the data content of a file, not just its structure:
It’s often a good idea to validate data values to make sure they’re legal for the column types into which you’re storing them. For example, you can make sure that values intended for INT, DATE, and ENUM columns are integers, dates in CCYY-MM-DD format, and legal enumeration values.
Data values may need reformatting. Rewriting dates from one format to another is especially common. For example, if you’re importing a FileMaker Pro file into MySQL, you’ll likely need to convert dates from MM-DD-YY format to ISO format. If you’re going in the other direction, from MySQL to FileMaker Pro, you’ll need to perform the inverse date transformation, as well as split DATETIME and TIMESTAMP columns into separate date and time columns.
It may be necessary to recognize special values in the file. It’s common to represent NULL with a value that does not otherwise occur in the file, such as -1, Unknown, or N/A. If you don’t want those values to be imported literally, you’ll need to recognize and handle them specially.
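The three kinds of checks just described can be sketched in a few small functions. This is only an illustration in Python, not the recipe's own code: the function names, the NULL_MARKERS set, the case-insensitive ENUM comparison, and the pivot value of 70 for two-digit years (chosen to match MySQL's documented rule that 00-69 means 2000-2069 and 70-99 means 1970-1999) are all assumptions made for the example.

```python
import re
from datetime import date, datetime

# Hypothetical markers that stand for NULL in the input file; adjust to your data.
NULL_MARKERS = {"-1", "Unknown", "N/A"}


def is_integer(value):
    """Return True if value is an optionally signed string of digits."""
    return re.fullmatch(r"[-+]?\d+", value) is not None


def is_iso_date(value):
    """Return True if value is a real calendar date in CCYY-MM-DD format."""
    # The regex enforces the exact layout; strptime then rejects
    # impossible dates such as 2023-02-29.
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) is None:
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def is_enum_member(value, legal_values):
    """Return True if value is one of legal_values, ignoring case (an assumption)."""
    return value.lower() in {v.lower() for v in legal_values}


def us_to_iso_date(value, pivot=70):
    """Convert MM-DD-YY to CCYY-MM-DD.

    Two-digit years below pivot become 20YY; the rest become 19YY.
    Raises ValueError if the value is malformed or not a real date.
    """
    m = re.fullmatch(r"(\d{1,2})-(\d{1,2})-(\d{2})", value)
    if m is None:
        raise ValueError(f"not in MM-DD-YY format: {value!r}")
    month, day, yy = (int(g) for g in m.groups())
    year = 2000 + yy if yy < pivot else 1900 + yy
    return date(year, month, day).isoformat()


def null_if_marker(value, markers=NULL_MARKERS):
    """Return None if value is a NULL marker, else the value unchanged."""
    return None if value in markers else value
```

A loader might call null_if_marker first, then validate or convert only the values that come back non-None, so that a marker such as -1 is never mistaken for a legal integer.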
This section begins ...