9.3. Error Outputs
Error outputs can obviously be used to improve reliability, but they also have an important part to play for scalability as well. From a reliability perspective, they are a critical feature for coping with bad data. An appropriately configured component will direct failing rows down the error output as opposed to the main output. These rows are now removed from the main Data Flow path and may then receive additional treatment and cleansing to enable them to be recovered and merged back into the main flow. They can be explicitly merged, such as with a Union transform, or implicitly through a second adapter directed at the final destination. Alternatively they could be discarded. Rows are rarely discarded totally; more often they will be logged and dealt with at a later point in time.
The capability to recover rows is perhaps the most useful course of action. If a data item is missing in the source extract but required in the final destination, the error flow path can be used to fix this. If the data item is available from a secondary system, then a lookup could be used. If the data item is not available elsewhere, then perhaps a default value can be used instead.
In other situations, the data may be out of range for the process or destination. If the data causes an integrity violation, then the failed data could be used to populate the constraining reference with new values and then the data itself could be successfully processed. If a data type conflict occurs, ...