March 2019
Beginner to intermediate
778 pages
34h 20m
English
Element-wise transforms operate on the individual elements of a PCollection. This concept can loosely be compared to the map step of MapReduce. Developers execute these transformations by invoking ParDo, the core parallel-processing operation provided by the Cloud Dataflow SDK.
ParDo accepts a DoFn object, for which developers provide an implementation. ParDo applies that DoFn to each element of an input PCollection and returns a new PCollection as output; the DoFn itself only ever sees one element at a time. As a simple example, in order to transform a PCollection of strings into a PCollection of lower-case words, we could define a DoFn as follows:
static class FlatMapStringsToWords extends DoFn<String, String> ...
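Because the listing above is truncated, the element-wise idea can be illustrated with a minimal plain-Java sketch that has no SDK dependency. It is not the Cloud Dataflow API: `SimpleDoFn`, `parDo`, and `LowerCaseWords` are hypothetical stand-ins, and an ordinary `List` plays the role of a PCollection. The sketch only shows the shape of the pattern, in which a per-element function may emit zero or more outputs and a driver applies it to every input element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Consumer;

class ElementWiseSketch {

    // Hypothetical stand-in for the SDK's DoFn: handles one element at a time
    // and may emit zero or more outputs through the supplied consumer.
    interface SimpleDoFn<I, O> {
        void processElement(I element, Consumer<O> output);
    }

    // Hypothetical stand-in for ParDo: applies the function to every element
    // of the input and gathers everything it emits into a new list.
    static <I, O> List<O> parDo(List<I> input, SimpleDoFn<I, O> fn) {
        List<O> result = new ArrayList<>();
        for (I element : input) {
            fn.processElement(element, result::add);
        }
        return result;
    }

    // Splits each line on whitespace and emits every non-empty word in lower case.
    static class LowerCaseWords implements SimpleDoFn<String, String> {
        @Override
        public void processElement(String line, Consumer<String> output) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    output.accept(word.toLowerCase(Locale.ROOT));
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Hello World", "Cloud DATAFLOW");
        // Prints [hello, world, cloud, dataflow]
        System.out.println(parDo(lines, new LowerCaseWords()));
    }
}
```

In this sketch the loop runs sequentially on one machine. In the real SDK, the per-element function is what the service distributes across workers, which is why it should not depend on the order in which elements arrive.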