Learning Pentaho Data Integration 8 CE - Third Edition
by Dan Keeley, Diethard Steiner, María Carina Roldán, Miguel Gaspar, Pablo Castagnaro, Paula Clemente, Paulo Alexandre de Oliveira Rodrigues Pires
Distributing rows
As mentioned earlier, when you split a stream, you can either copy or distribute the rows. Copying means creating a copy of the whole dataset for each output stream, so every destination step receives all the rows. Distributing means that the rows of the dataset are divided among the destination steps, so each step receives only part of them. Those steps run in separate threads, so distribution is a way to implement parallel processing.
When you distribute, the destination steps receive the rows in a round-robin fashion. For example, if you have three target steps, such as the three calculators in the following screenshot, the first row of data goes to the first target step, the second row goes to the second step, the third row goes to the third step, the fourth row goes back to the first step ...
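The difference between the two modes can be sketched outside of PDI in a few lines of Python. This is only an illustration of the idea, not PDI's actual implementation: the function names `copy_rows` and `distribute_rows` and the sample rows are hypothetical, and each target step is modeled as a plain list rather than a thread.

```python
from typing import Any, List, Sequence


def copy_rows(rows: Sequence[Any], n_targets: int) -> List[List[Any]]:
    """Copy mode: every target receives the whole dataset."""
    return [list(rows) for _ in range(n_targets)]


def distribute_rows(rows: Sequence[Any], n_targets: int) -> List[List[Any]]:
    """Distribute mode: rows are dealt to the targets in round-robin order."""
    targets: List[List[Any]] = [[] for _ in range(n_targets)]
    for index, row in enumerate(rows):
        # Row 0 goes to target 0, row 1 to target 1, ...,
        # and row n_targets wraps around to target 0 again.
        targets[index % n_targets].append(row)
    return targets


# Hypothetical sample data: five rows sent to three target steps.
rows = ["row1", "row2", "row3", "row4", "row5"]
copied = copy_rows(rows, 3)
distributed = distribute_rows(rows, 3)

for number, received in enumerate(distributed, start=1):
    print(f"target {number}: {received}")
```

With three targets, the fourth row lands on the first target again, which is the wrap-around that round-robin implies. In copy mode, each of the three lists would hold all five rows.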