Scalable Data Streaming with Amazon Kinesis

Book description

Explore Kinesis managed services such as Kinesis Data Streams, Kinesis Data Analytics, Kinesis Data Firehose, and Kinesis Video Streams with the help of practical use cases

Key Features

  • Get well versed with the capabilities of Amazon Kinesis
  • Explore the monitoring, scaling, security, and deployment patterns of various Amazon Kinesis services
  • Learn how other AWS services and third-party applications such as Splunk can be used as destinations for Kinesis data

Book Description

Amazon Kinesis is a collection of secure, serverless, durable, and highly available purpose-built data streaming services. These services provide APIs and client SDKs that enable you to produce and consume data at scale.

Scalable Data Streaming with Amazon Kinesis begins with a quick overview of the core concepts of data streams, along with the essentials of the Amazon Kinesis landscape. You'll then explore the requirements of the use case that runs throughout the book to help you get started, and cover the key pain points encountered in the data stream life cycle. As you advance, you'll get to grips with the architectural components of Kinesis, understand how they are configured to build data pipelines, and delve into the applications that connect to them for consumption and processing. You'll also build a Kinesis data pipeline from scratch and learn how to implement and apply practical solutions. Moving on, you'll learn how to configure Kinesis on a cloud platform. Finally, you'll learn how other AWS services, including Amazon Redshift, Amazon DynamoDB, Amazon S3, and Amazon Elasticsearch Service, and third-party applications such as Splunk can be integrated with Kinesis.

By the end of this AWS book, you'll be able to build and deploy your own Kinesis data pipelines with Kinesis Data Streams (KDS), Kinesis Data Firehose (KFH), Kinesis Video Streams (KVS), and Kinesis Data Analytics (KDA).

What you will learn

  • Get to grips with data streams, decoupled design, and real-time stream processing
  • Understand the properties of KFH that differentiate it from other Kinesis services
  • Monitor and scale KDS using CloudWatch metrics
  • Secure KDA with identity and access management (IAM)
  • Deploy KVS as infrastructure as code (IaC)
  • Integrate services such as Redshift, DynamoDB, and Splunk with Kinesis

Who this book is for

This book is for solutions architects, developers, system administrators, data engineers, and data scientists looking to evaluate and choose the most performant, secure, scalable, and cost-effective data streaming technology to overcome their data ingestion and processing challenges on AWS. Prior knowledge of cloud architectures on AWS and of data streaming technologies and architectures is expected.

Table of contents

  1. Scalable Data Streaming with Amazon Kinesis
  2. Contributors
  3. About the authors
  4. About the reviewers
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Reviews
  6. Section 1: Introduction to Data Streaming and Amazon Kinesis
  7. Chapter 1: What Are Data Streams?
    1. Introducing data streams
      1. Sources of data
    2. The value of real-time data in analytics
    3. Decoupling systems
    4. Challenges associated with distributed systems
      1. Transactions per second
      2. Scaling
      3. Latency
      4. Fault tolerance/high availability
    5. Overview of messaging concepts
      1. Overview of core messaging components
      2. Messaging concepts
    6. Examples of data streaming
      1. Application log processing
      2. Internet of Things
      3. Real-time recommendations
      4. Video streams
    7. Summary
    8. Further reading
  8. Chapter 2: Messaging and Data Streaming in AWS
    1. Amazon Kinesis Data Streams (KDS)
      1. Encryption, authentication, and authorization
      2. Producing and consuming records
      3. Data delivery guarantees
      4. Integration with other AWS services
      5. Monitoring
    2. Amazon Kinesis Data Firehose (KDF)
      1. Encryption, authentication, and authorization
      2. Monitoring
      3. Producers
      4. Delivery destinations
      5. Transformations
    3. Amazon Kinesis Data Analytics (KDA)
      1. Amazon KDA for SQL
      2. Amazon Kinesis Data Analytics for Apache Flink (KDA Flink)
    4. Amazon Kinesis Video Streams (KVS)
    5. Amazon Simple Queue Service (SQS)
    6. Amazon Simple Notification Service (SNS)
      1. Amazon SNS integrations with other AWS services
      2. Encryption at rest
    7. Amazon MQ for Apache ActiveMQ
    8. IoT Core
      1. Device software
      2. Control services
      3. Analytics services
    9. Amazon Managed Streaming for Apache Kafka (MSK)
      1. Apache Kafka
      2. Amazon MSK
    10. Amazon EventBridge
    11. Service comparison summary
    12. Summary
  9. Chapter 3: The SmartCity Bike-Sharing Service
    1. The mission for sustainable transportation
    2. SmartCity new mobile features
      1. SmartCity data pipeline
      2. SmartCity data lake
      3. SmartCity operations and analytics dashboard
      4. SmartCity video
    3. The AWS Well-Architected Framework
    4. Summary
    5. Further reading
  10. Section 2: Deep Dive into Kinesis
  11. Chapter 4: Kinesis Data Streams
    1. Technical requirements
    2. Discovering Amazon Kinesis Data Streams
      1. Creating streams and shards
    3. Creating a stream producer application
    4. Creating a stream consumer application
    5. Data pipelines with Amazon Kinesis Data Streams
      1. Data pipeline design (simple)
      2. Data pipeline design (intermediate)
      3. Data pipeline design (full design)
      4. Designing for scalable and reliable analytics pipelines
      5. Monitoring and scaling with Amazon Kinesis Data Streams
      6. X-Ray tracing with Amazon Kinesis Data Streams
      7. Scaling up with Amazon Kinesis Data Streams
      8. Securing Amazon Kinesis Data Streams
      9. Implementing least-privilege access
    6. Summary
    7. Further reading
  12. Chapter 5: Kinesis Firehose
    1. Technical requirements
      1. Setting up the AWS account
      2. Using a local development environment
      3. Using an AWS Cloud9 development environment
      4. Code examples
    2. Discovering Amazon Kinesis Firehose
      1. Understanding KDF delivery streams
    3. Understanding encryption in KDF
    4. Using data transformation in KDF with a Lambda function
    5. Understanding delivery stream destinations
      1. Amazon S3
      2. Amazon Redshift
      3. Amazon Elasticsearch Service
      4. Splunk destination
      5. HTTP endpoint destination
    6. Understanding data format conversion in KDF
      1. Deserialization
      2. Schema
      3. Serializer
      4. Data format conversion errors
    7. Understanding monitoring in KDF
    8. Use-case example – Bikeshare station data pipeline with KDF
      1. Steps to recreate the example
    9. Summary
    10. Further reading
  13. Chapter 6: Kinesis Data Analytics
    1. Technical requirements
      1. AWS account setup
      2. AWS CDK
      3. Java and Java IDE
      4. Code examples
    2. Discovering Amazon KDA
    3. Working on SmartCity bike share analytics use cases
    4. Creating operational insights using SQL Engine
      1. Core concepts and capabilities
    5. Creating operational insights using Apache Flink
      1. Options for running Flink applications in AWS Cloud
      2. Flink applications on KDA
    6. Building bike ride analytic applications
      1. Setting up a producer application
      2. Building a KDA SQL application
      3. Building a KDA Flink application
    7. Monitoring KDA applications
    8. Summary
    9. Further reading
      1. Blogs
      2. Workshops
  14. Chapter 7: Amazon Kinesis Video Streams
    1. Technical requirements
      1. AWS account setup
      2. Using a local development environment
      3. Code examples
    2. Understanding video fundamentals
      1. Containers
      2. Codecs
    3. Discovering Amazon Kinesis Video Streams WebRTC
      1. Core concepts and connection patterns
      2. Creating a signaling channel
      3. Establishing a connection
    4. Discovering Amazon KVS
      1. Key components of KVS
      2. Stream
      3. Kinesis producer
      4. Consuming
      5. Creating a stream
      6. Producing
      7. Integration with Rekognition
    5. Building video-enabled applications with KVS
    6. Summary
    7. Further reading
  15. Section 3: Integrations
  16. Chapter 8: Kinesis Integrations
    1. Technical requirements
      1. AWS account setup
      2. AWS CLI
      3. Kinesis Data Generator
      4. Code examples
    2. Amazon services that can produce data to send to Kinesis
      1. Amazon Connect
      2. Amazon Aurora database activity
      3. DynamoDB activity
      4. Processing Kinesis data with Apache Spark
      5. Amazon services that consume data from Kinesis
      6. Serverless data lake
    3. Amazon services that transform Kinesis data
      1. Routing events with EventBridge
    4. Third-party integrations with Kinesis
      1. Splunk
    5. Summary
    6. Further reading
    7. Why subscribe?
  17. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Leave a review - let other readers know what you think

Product information

  • Title: Scalable Data Streaming with Amazon Kinesis
  • Author(s): Tarik Makota, Brian Maguire, Danny Gagne, Rajeev Chakrabarti
  • Release date: March 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781800565401