Chapter 47. Know Thy flatMap
Daniel Hinojosa
Job titles morph constantly. As in the medical community, where the focus may be broader or more specialized, some of us who were once just programmers are now filling other job titles. One of the newest specialized disciplines is data engineer. The data engineer shepherds in the data, building pipelines, filtering data, transforming it, and molding it into what they or others need to make real-time business decisions with stream processing.
Both the general programmer and the data engineer must master flatMap, one of the most important tools for any functional-capable language like our beloved Java, but also for big data frameworks and streaming libraries. flatMap, like its partners map and filter, applies to anything that is a “container of something”, for example Stream<T> and CompletableFuture<T> (where the same operation goes by the name thenCompose). If you want to look beyond the standard library, there are also Observable<T> (RxJava) and Flux<T> (Project Reactor).
In Java, we will use Stream<T>. The idea behind map is simple. It takes each element of a stream or collection and applies a function to it:
Stream.of(1, 2, 3, 4).map(x -> x * 2).collect(Collectors.toList())
This produces:
[2, 4, 6, 8]
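For readers who want to run this, the one-liner can be wrapped in a small program. This is only a sketch: the class name MapExample and the helper doubleAll are illustrative names, not part of the original text.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapExample {
    // Hypothetical helper: applies x -> x * 2 to each element of the stream.
    static List<Integer> doubleAll(Stream<Integer> numbers) {
        return numbers.map(x -> x * 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // One input element yields exactly one output element.
        System.out.println(doubleAll(Stream.of(1, 2, 3, 4))); // prints [2, 4, 6, 8]
    }
}
```

Note that map never changes the number of elements: four values go in and four come out.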
What happens if we do the following?
Stream.of(1, 2, 3, 4)
    .map(x -> Stream.of(-x, x, x + 1))
    .collect(Collectors.toList())
Unfortunately, we get a List of Stream pipelines:
[java.util.stream.ReferencePipeline$Head@3532ec19, ...
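Writing out the result type makes the nesting easier to see. The sketch below assumes nothing beyond the standard library; the class name NestedStreamExample and the helper expandWithMap are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NestedStreamExample {
    // Hypothetical helper: maps each number to a Stream of three numbers.
    // The declared return type shows the nesting: a List of Streams.
    static List<Stream<Integer>> expandWithMap(Stream<Integer> numbers) {
        return numbers
                .map(x -> Stream.of(-x, x, x + 1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Stream<Integer>> nested = expandWithMap(Stream.of(1, 2, 3, 4));
        System.out.println(nested.size()); // prints 4: one inner Stream per input

        // Each inner Stream has to be consumed on its own to reveal its values.
        for (Stream<Integer> inner : nested) {
            System.out.println(inner.collect(Collectors.toList()));
        }
    }
}
```

The loop prints [-1, 1, 2], [-2, 2, 3], [-3, 3, 4], and [-4, 4, 5] on separate lines, which confirms that the values are there but trapped one level too deep.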