Fundamentals of Apache Flink

by Sridhar Alla

Released October 2019

Publisher(s): Packt Publishing

ISBN: 9781789539332

Video description

Have you heard of Apache Flink, but don't know how to use it to get on top of big data? Have you used Flink, but want to learn how to set it up and use it properly? Either way, this course is for you.

This course first introduces Flink concepts and terminology, and then moves on to building a Flink instance, collecting data, and turning that data into processed output that other systems can consume as input. You will also use the Flink APIs to process data in batch and streaming modes.

By the end of the course, you will be capable of using the Apache Flink ecosystem to achieve complex tasks such as event processing and machine learning.

What You Will Learn

Build your own Flink development environment on a Linux server
Monitor your stream processing in real time using the Flink UI
Organize your data comprehensively using data processing pipelines
Build end-to-end, real-time analytics projects
Design a distributed Flink environment to efficiently process, transform, and aggregate your data

Audience

This course targets big data developers keen to process batch and real-time data on distributed systems. A basic knowledge of Hadoop and big data is assumed, and a reasonable knowledge of Java or Scala is expected.

About The Author

Sridhar Alla: Sridhar Alla is the co-founder and CTO of Blue Whale Consulting and is an expert at helping companies (big and small) define their vision for systems and capabilities, so that they can establish a strategic execution plan for the ever-growing data collected to support analytics and product teams. He is very experienced in dealing with all aspects of data collection, security, governance, and processing as part of end-to-end big data analytics and machine learning initiatives (including predictive modeling, deep learning, and ML automation).

Sridhar is a published book author and an avid presenter at numerous conferences, including Strata, Hadoop World, and Spark Summit. He also has several patents filed with the USPTO on large-scale computing and distributed systems.

He has over 18 years' experience writing code in Scala, Java, C, C++, Python, R, and Go, and has extensive hands-on knowledge of Spark, Flink, TensorFlow, Keras, Hadoop, Cassandra, HBase, MongoDB, Riak, Redis, Zeppelin, Mesos, Docker, Kafka, ElasticSearch, Solr, H2O, machine learning, text analytics, distributed computing, and high-performance computing.

Sridhar lives with his wife and daughter in New Jersey. In his spare time, he loves blogging and coaching organizations on next-generation advancements in technology and their alignment with business goals.