Live online training

Stream Processing and Beyond with Apache Flink

Getting started with real-time, low-latency data processing

Topic: Data
Bowen Li

Time is value, and it’s missing in your legacy big data infrastructure. Businesses must now generate more value from their real-time data to offer better service and interact with customers faster. That requires shifting from batch-oriented big data processing, with latencies of hours or even days, to stream processing, where data is processed as it arrives with very low latency in a scalable, fault-tolerant fashion. Apache Flink is driving this trend, helping businesses grow faster through its streaming-first architecture and its key features and capabilities.

Join expert Bowen Li to learn how to leverage Flink’s building blocks and develop Flink applications. You’ll understand how Flink works and get hands-on experience using it to process streaming data in real time, with lessons you can immediately apply in your own work.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • How the Apache Flink community works
  • The fundamental concepts of stream processing and Flink
  • What makes Flink fast, reliable, and scalable
  • How to architect a native real-time data platform
  • What the Apache Flink community has been actively working on beyond stream processing, in fields like machine learning, AI, and serverless computing

And you’ll be able to:

  • Set up a local Flink development and testing environment
  • Run an end-to-end application with Flink SQL and Kafka
  • Write Flink applications using the DataStream API in Java and using Flink SQL

This training course is for you because...

  • You’re a data engineer or software engineer who’s eager to build real-time data pipelines and platforms.
  • You’re a product manager or business manager who wants to understand the use cases and capabilities offered by Flink and stream processing.
  • You want to become an expert in stream processing, real-time ETL, and Apache Flink.

Prerequisites

  • A basic understanding of Java programming and SQL
  • Familiarity with big data tools and platforms like Hadoop

About your instructor

  • Bowen Li is an Apache Flink committer and a senior engineer at Alibaba. He has worked on Flink for over three years and has extensive experience developing and operating Flink at Alibaba at very large scale.

    Besides committing code and reviewing designs, Bowen frequently speaks about Flink at conferences and events, advocating for Flink and stream processing to make the world a little more real-time and data-driven.

Schedule

The timeframes are only estimates and may vary according to how the class progresses.

Introduction to Apache Flink and stream processing (45 minutes)

  • Presentation: Introduction to Flink; use cases; example architectures and pipelines based on Flink; event time and watermarks; advanced windowing
  • Q&A

Break (5 minutes)

Inside Flink (40 minutes)

  • Presentation: Flink state and state backends; exactly-once and at-least-once guarantees; checkpointing
  • Hands-on exercise: Run and operate a local Flink cluster
  • Q&A

Break (5 minutes)

Processing stream data with the Flink DataStream API and Flink SQL (40 minutes)

  • Presentation: Connectors; deployment; the DataStream API; Flink SQL and stream-table duality
  • Hands-on exercises: Build a streaming application with the Flink DataStream API; build a streaming application with Flink SQL and Kafka
  • Q&A

Break (5 minutes)

Beyond stream processing: Unified data engines, machine learning/AI, and more (40 minutes)

  • Presentation: Flink batch; Flink + a traditional data warehouse; Flink + machine learning/deep learning; Flink + serverless
  • Q&A