O'Reilly live online training

Functional Java for Big Data

Topic: Software Development
Simon Roberts

Functional programming helps you make the most of modern parallel hardware and is at the heart of many of the tools and libraries used for "big data" processing. Java 8 introduced lambda syntax to support functional programming, and that syntax gives access to several big data toolkits, from Java's own Streams API to tools such as Apache Spark and Kafka.

While it’s fairly easy to copy and paste workable code, it’s far more effective to learn how functional programming is used as a style, and how designs flow from it. This course is presented using Java 8’s syntax and will give you a strong, practical understanding of the key tools of functional programming and how they support the big data approach. The training is heavy on understandability and practicality, and light on academic formality. We’ll start by introducing functional concepts in regular Java code, then build through the lambda syntax and the essential design patterns of functional programming, and finish with some simple examples created in Apache Spark.

What you'll learn and how you can apply it

  • Benefits and uses of the function as a first-class language citizen
  • Designing software using functions, rather than classes, as minimum units of code reuse
  • Benefits of immutable data and pure functions
  • The Lambda syntax
  • Handling checked exceptions in functional code
  • Using Java 8 streams
  • Running streams in parallel
  • Architectural consequences of parallel streams
  • Introduction to distributed data processing with Apache Spark

This training course is for you because...

  • You are a Java software engineer not yet using functional design, and you’d like to start benefiting from what this paradigm has to offer
  • You want to write Java programs that benefit from parallel or distributed hardware using Streams or one of the distributed big data platforms
  • You want to develop a core understanding of functional and big-data programming concepts
  • You might be a programmer in another language who wants a solid understanding of how functional programming works. This course emphasizes understanding, and that understanding is not specific to any particular language


  • You should be comfortable reading code written in the Java programming language, in particular with the use of interfaces in that language
  • You should understand basic software design concepts such as the separation of concerns

Materials, downloads, or Supplemental Content needed in advance:

  • Some practical exercises will be suggested during the course. You’re free to attempt these in any language you choose, but assistance and examples from the instructor will be in Java
  • A source code repository on github.com will be used to distribute examples and other materials created during the course

About your instructor

  • Simon started out working as a software engineer, specializing in industrial control systems, and had a sideline teaching for a local university in his then hometown of Cambridge, England.

    In 1995 he joined Sun Microsystems, Inc. as a senior instructor and course developer. Simon spearheaded the introduction of Java training by Sun Microsystems in the U.K. in 1995. He developed the first Java certification exams for Sun before he moved to the U.S. in 1998.

    Since leaving Sun in 2004, Simon has developed and delivered training for clients around the world.

    Simon believes that training should have an immediate purpose and application, and that the most effective training is usually "on the job" mentoring, helping to remove the immediate roadblocks to productivity that so often plague workers in fast moving environments.


The timeframes are only estimates and may vary according to how the class is progressing.

Day 1 (4 hours)

Segment 1 From an OO design pattern to a functional foundation (60 minutes)

  • In this segment we’ll investigate a familiar problem that has a solution that is generally well understood in OO design. We’ll investigate that solution and discover that it’s actually a functional solution. From there we’ll build the foundations of perhaps the most central perspective of functional design.
  • Break 10 minutes

Segment 2 Building more functional concepts and introducing the Lambda syntax (95 minutes)

  • In this segment we will expand our investigation of our core functional concept, looking at some more object-oriented design considerations and how they lead to functional solutions and key functional design patterns.
  • Breaks at midpoint and end, 10 minutes each

Segment 3 Solidifying Java syntax: Functional Interfaces, Method References, and Generics (55 minutes)

  • In this segment we will shift gears and look at the syntax mechanisms that are provided by Java that support functional approaches. Although the syntax is of course specific to Java 8, discussions of the intent of the syntax will not be.
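
To give a flavor of the syntax mechanisms named in this segment, here is a minimal sketch. The `Criterion` interface, the `filter` method, and the class name are hypothetical names chosen for illustration, not material from the course. The sketch shows a generic functional interface implemented once with a lambda expression and once with a method reference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SyntaxSketch {
    // Hypothetical functional interface: it declares exactly one abstract method,
    // so a lambda or method reference can supply the implementation.
    @FunctionalInterface
    public interface Criterion<E> {
        boolean test(E e);
    }

    // Generic method: returns a new list of the items that satisfy the criterion.
    public static <E> List<E> filter(List<E> input, Criterion<? super E> crit) {
        List<E> out = new ArrayList<>();
        for (E e : input) {
            if (crit.test(e)) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stream", "", "lambda", "");

        // Lambda expression implementing Criterion<String>.
        List<String> longWords = filter(words, s -> s.length() > 5);

        // Method reference to an instance method of String.
        List<String> empties = filter(words, String::isEmpty);

        System.out.println(longWords);      // prints [stream, lambda]
        System.out.println(empties.size()); // prints 2
    }
}
```

The JDK's own `java.util.function.Predicate` plays the same role as the hypothetical `Criterion`; a hand-written interface is used here only to show that nothing about the mechanism is special to library types.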

Day 2

Segment 4 The power of combinations and modifications (25 minutes)

  • In this segment we will investigate the enormous power that can arise from reusing functional code. As with other design styles, units of code in a functional design can be combined, and can be wrapped so their behavior is altered. And, perhaps more so than in other design styles, the effect of such combinations and modifications can be incredibly powerful.
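
As a rough illustration of combining and wrapping, the sketch below uses the standard `Predicate.and` and `Function.andThen` methods to build new behavior from small pieces. The `nullSafe` wrapper and all constant names are hypothetical, introduced only for this example, and are not taken from the course.

```java
import java.util.function.Function;
import java.util.function.Predicate;

public class CombineSketch {
    // Two small, independent building blocks.
    public static final Predicate<String> IS_SHORT = s -> s.length() < 5;
    public static final Predicate<String> STARTS_WITH_S = s -> s.startsWith("s");

    // Combination: a new predicate that is true only when both parts are true.
    public static final Predicate<String> SHORT_S_WORD = IS_SHORT.and(STARTS_WITH_S);

    public static final Function<String, String> TRIM = String::trim;
    public static final Function<String, Integer> LENGTH = String::length;

    // Combination: run TRIM, then feed its result to LENGTH.
    public static final Function<String, Integer> TRIMMED_LENGTH = TRIM.andThen(LENGTH);

    // Wrapping (hypothetical helper): returns a function that behaves like f,
    // except that a null argument yields the fallback instead of being passed to f.
    public static <T, R> Function<T, R> nullSafe(Function<T, R> f, R fallback) {
        return t -> t == null ? fallback : f.apply(t);
    }

    public static void main(String[] args) {
        System.out.println(SHORT_S_WORD.test("spin"));    // prints true
        System.out.println(SHORT_S_WORD.test("stream"));  // prints false
        System.out.println(TRIMMED_LENGTH.apply("  hi ")); // prints 2

        Function<String, Integer> safeLength = nullSafe(TRIMMED_LENGTH, 0);
        System.out.println(safeLength.apply(null));       // prints 0
    }
}
```

None of the original pieces is modified; each combination or wrapper produces a new function, which is what makes the pieces safe to reuse elsewhere.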

Segment 5 The importance of pure functions (25 minutes)

  • This segment delves into another of functional programming’s core paradigms, one that’s often overlooked because it can seem inconvenient. That is the idea of a pure function. Programming with pure functions can seem a little mind-bending, particularly at first, but there are several benefits that can accrue from them, and even a limited application of the idea can provide many of those benefits in practical systems.
  • Break 10 minutes
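
One way to picture the pure function idea from Segment 5 is to contrast two versions of the same operation. This is a minimal sketch with hypothetical method names, not the course's own example: the first method changes its argument, while the second depends only on its input and leaves that input untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PureSketch {
    // Impure: mutates the caller's list, so the effect is visible outside the method.
    public static void doubleInPlace(List<Integer> values) {
        for (int i = 0; i < values.size(); i++) {
            values.set(i, values.get(i) * 2);
        }
    }

    // Pure: the result depends only on the argument, and the argument is not changed.
    public static List<Integer> doubled(List<Integer> values) {
        List<Integer> out = new ArrayList<>(values.size());
        for (Integer v : values) {
            out.add(v * 2);
        }
        // Returning an unmodifiable view discourages later mutation of the result.
        return Collections.unmodifiableList(out);
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3);

        List<Integer> result = doubled(data);
        System.out.println(data);   // prints [1, 2, 3]
        System.out.println(result); // prints [2, 4, 6]

        doubleInPlace(data);
        System.out.println(data);   // prints [2, 4, 6]
    }
}
```

Because `doubled` never touches shared state, calling it from several threads at once needs no coordination, which is one of the practical benefits the segment description alludes to.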

Segment 6 Introducing Streams and the Monad (60 minutes)

  • This segment builds on the concepts learned to this point to dive into the use of Java 8’s Streams API.
  • Break 10 minutes
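
For a sense of what a Streams pipeline from Segment 6 looks like, here is a small sketch using only standard Java 8 library calls. The `distinctWords` method and its sample data are hypothetical and chosen for illustration. It uses `flatMap`, the operation most closely associated with the monad idea, to turn each line into a stream of words and merge the results into one stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamSketch {
    // Splits lines into words, normalizes case, and returns the sorted distinct words.
    public static List<String> distinctWords(List<String> lines) {
        return lines.stream()
            // Each line becomes a stream of words; flatMap merges them into one stream.
            .flatMap(line -> Arrays.stream(line.split("\\s+")))
            // Splitting a line with leading whitespace can yield an empty string.
            .filter(word -> !word.isEmpty())
            .map(String::toLowerCase)
            .distinct()
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Functional Java", "java streams");
        System.out.println(distinctWords(lines)); // prints [functional, java, streams]
    }
}
```

Each stage is described by a function passed as an argument, so the pipeline states what should happen to the data and leaves the iteration itself to the library.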

Segment 7 Using parallel streams for high performance data processing (50 minutes)

  • Break 10 minutes

Segment 8 Introducing Spark and the distributed approaches to big-data processing (50 minutes)

  • Setting up a simple Spark project with Java and Maven
  • Converting the Streams example to use Spark for parallel, distributed processing
  • Understanding essential architectural considerations for distributed big-data systems