AWS Certified Data Analytics Specialty (2023) Hands-on

Video description

In this course, you will learn about streaming massive data with Amazon Kinesis; queuing messages with Simple Queue Service (SQS); wrangling the explosion of data from the Internet of Things (IoT); transitioning from small to big data with the AWS Database Migration Service (DMS); storing massive data lakes with the Simple Storage Service (S3); optimizing transactional queries with DynamoDB; tying your big data systems together with AWS Lambda; making unstructured data queryable with AWS Glue, Glue ETL, Glue DataBrew, Glue Studio, and Lake Formation; processing data at unlimited scale with Elastic MapReduce; applying neural networks at massive scale with deep learning, MXNet, and TensorFlow; applying advanced machine learning algorithms at scale with Amazon SageMaker; analyzing streaming data in real time with Kinesis Analytics; searching and analyzing petabyte-scale data with Amazon OpenSearch Service (formerly Elasticsearch); querying S3 data lakes with Amazon Athena; hosting massive-scale data warehouses with Redshift and Redshift Spectrum; integrating smaller data with your big data using the Relational Database Service (RDS) and Aurora; visualizing your data interactively with QuickSight; and finally, keeping your data secure with encryption, KMS, HSM, IAM, Cognito, STS, and more.

By the end of this course, you will be well-versed in the essential concepts and major domains necessary to pass the AWS DAS-C01 exam.

What You Will Learn

  • Store big data with S3 and DynamoDB in a scalable, secure manner
  • Move and transform massive data streams with Amazon Kinesis
  • Use the Hadoop ecosystem on AWS with Elastic MapReduce
  • Discover various methods to analyze big data
  • Visualize big data in the cloud using Amazon QuickSight
  • Keep your data secure with encryption, KMS, HSM, IAM, Cognito, and STS

Audience

This course is for experienced technologists seeking certification in big data technologies through Amazon Web Services. If you are pursuing this certification, we recommend earning an associate-level certification first.

About The Authors

Frank Kane: Frank Kane spent nine years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers. He holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and teaches others about big data analysis.

Stéphane Maarek: Stéphane Maarek is a solutions architect, consultant, and software developer who has a particular interest in all things related to big data and analytics. He is also a best-selling instructor on Udemy for his courses on Apache Kafka, Apache NiFi, and AWS Lambda. He loves Apache Kafka and regularly contributes to the Apache Kafka project.

Stéphane has also written a guest blog post that was featured on the website of Confluent, the company behind Apache Kafka. He is also an AWS Certified Solutions Architect and has many years of experience with technologies such as Apache Kafka, Apache NiFi, Apache Spark, Hadoop, PostgreSQL, Tableau, Spotfire, Docker, Ansible, and more.

Table of contents

  1. Chapter 1 : Introduction
    1. Course Overview
    2. Introducing our Hands-On Case Study: Cadabra.com
    3. Cost of the Course + AWS (Amazon Web Services) Budget Setup
  2. Chapter 2 : Domain 1: Collection
    1. Collection Section Introduction
    2. Kinesis Data Streams Overview
    3. Kinesis Producers
    4. Kinesis Consumers
    5. Kinesis Data Streams - Hands-On
    6. Kinesis Enhanced Fan Out
    7. Kinesis Scaling
    8. Kinesis - Handling Duplicate Records
    9. Kinesis Security
    10. Kinesis Data Firehose
    11. CloudWatch Subscription Filters with Kinesis
    12. (Exercise) Kinesis Firehose, Part 1
    13. (Exercise) Kinesis Firehose, Part 2
    14. (Exercise) Kinesis Firehose, Part 3
    15. (Exercise) Kinesis Data Streams
    16. SQS Overview
    17. Kinesis Data Streams Versus SQS
    18. Database Migration Service (DMS)
    19. Direct Connect
    20. Snow Family
    21. MSK: Managed Streaming for Apache Kafka
    22. MSK Connect
    23. MSK Serverless
    24. Kinesis vs MSK
  3. Chapter 3 : Domain 2: Storage
    1. S3 Overview
    2. S3 Hands-On
    3. S3 Security: Bucket Policy
    4. S3 Security: Bucket Policy Hands-On
    5. S3 Versioning
    6. S3 Versioning - Hands-On
    7. S3 Replication
    8. S3 Replication Notes
    9. S3 Replication – Hands-On
    10. S3 Storage Classes Overview
    11. S3 Storage Classes Hands-On
    12. S3 Lifecycle Rules (with S3 Analytics)
    13. S3 Lifecycle Rules – Hands-On
    14. S3 Event Notifications
    15. S3 Event Notifications – Hands-On
    16. S3 Performance
    17. S3 Select and Glacier Select
    18. S3 Encryption
    19. S3 Encryption – Hands-On
    20. S3 Default Encryption
    21. S3 Access Points
    22. S3 Object Lambda
    23. DynamoDB Overview
    24. DynamoDB Basics - Hands-On
    25. DynamoDB in Big Data
    26. DynamoDB RCU and WCU - Throughput
    27. DynamoDB RCU and WCU – Hands-On
    28. DynamoDB Basic APIs
    29. DynamoDB Basic APIs – Hands-On
    30. DynamoDB Indexes (GSI + LSI)
    31. DynamoDB Indexes (GSI + LSI) – Hands-On
    32. DynamoDB PartiQL
    33. DynamoDB DAX
    34. DynamoDB DAX - Hands-On
    35. DynamoDB Streams
    36. DynamoDB Streams – Hands-On
    37. DynamoDB TTL
    38. DynamoDB Patterns with S3
    39. DynamoDB Security
    40. (Exercise) DynamoDB
    41. ElastiCache Overview
  4. Chapter 4 : Domain 3: Processing
    1. Section Introduction: Processing
    2. What Is AWS Lambda?
    3. Lambda Integration - Part 1
    4. Lambda Integration - Part 2
    5. Lambda Costs, Promises, and Anti-Patterns
    6. (Exercise) AWS Lambda
    7. What Is Glue? + Partitioning Your Data Lake
    8. Glue, Hive, and ETL
    9. Modifying the Glue Data Catalog from ETL Scripts
    10. Glue ETL: Developer Endpoints, Running ETL Jobs with Bookmarks
    11. Glue Costs and Anti-Patterns
    12. AWS Glue Studio
    13. AWS Glue Data Quality
    14. AWS Glue DataBrew
    15. AWS Lake Formation
    16. AWS Lake Security
    17. Elastic MapReduce (EMR) Architecture and Usage
    18. EMR, AWS Integration, and Storage
    19. EMR Promises; Introduction to Hadoop
    20. EMR Serverless, EMR, and EKS
    21. Introduction to Apache Spark
    22. Spark Integration with Kinesis and Redshift
    23. Spark Integration with Athena
    24. Hive on EMR
    25. Pig on EMR
    26. HBase on EMR
    27. Presto on EMR
    28. Zeppelin and EMR Notebooks
    29. Hue, Splunk, and Flume
    30. S3DistCP and Other Services
    31. EMR Security and Instance Types
    32. (Exercise) Elastic MapReduce, Part 1
    33. (Exercise) Elastic MapReduce, Part 2
    34. AWS Data Pipeline
    35. AWS Step Functions
  5. Chapter 5 : Domain 4: Analysis
    1. Section Introduction: Analysis
    2. Introduction to Kinesis Analytics
    3. Kinesis Analytics Costs; RANDOM_CUT_FOREST
    4. (Exercise) Kinesis Analytics, Part 1
    5. (Exercise) Kinesis Analytics, Part 2
    6. (Exercise) Kinesis Analytics, Part 3
    7. (Exercise) Kinesis Analytics, Part 4
    8. Introduction to OpenSearch (formerly Elasticsearch)
    9. Amazon OpenSearch Service
    10. OpenSearch Index Management and Designing for Stability
    11. Amazon OpenSearch Service Performance
    12. Amazon OpenSearch Serverless
    13. (Exercise) Amazon OpenSearch Service
    14. Introduction to Athena
    15. Athena and Glue, Costs, and Security
    16. Athena Performance
    17. Athena ACID Transactions
    18. (Exercise) AWS Glue and Athena
    19. Redshift Introduction and Architecture
    20. Redshift Spectrum and Performance Tuning
    21. Redshift Durability and Scaling
    22. Redshift Distribution Styles
    23. Redshift Sort Keys
    24. Redshift Data Flows and the COPY Command
    25. Redshift Integration / WLM / Vacuum / Anti-Patterns
    26. Redshift Resizing (Elastic Versus Classic) and New Redshift Features in 2020
    27. Newer Redshift Features, AQUA
    28. Redshift Security Concerns
    29. Redshift Serverless
    30. (Exercise) Redshift Spectrum, Part 1
    31. (Exercise) Redshift Spectrum, Part 2
    32. Amazon Relational Database Service (RDS) and Aurora
  6. Chapter 6 : Domain 5: Visualization
    1. Section Introduction: Visualization
    2. Introduction to Amazon QuickSight
    3. QuickSight Pricing and Dashboards; ML Insights
    4. QuickSight Q
    5. Choosing Visualization Types
    6. (Exercise) Amazon QuickSight
    7. Other Visualization Tools (HighCharts, D3, and More)
  7. Chapter 7 : Domain 6: Security
    1. Encryption 101
    2. S3 Encryption (Reminder)
    3. KMS Overview
    4. KMS Key Rotation
    5. CloudHSM Overview
    6. AWS Services Security Deep Dive (1/3)
    7. AWS Services Security Deep Dive (2/3)
    8. AWS Services Security Deep Dive (3/3)
    9. STS and Cross Account Access
    10. Identity Federation
    11. Policies - Advanced
    12. CloudTrail
    13. VPC Endpoints
  8. Chapter 8 : Everything Else
    1. AWS Services Integrations
    2. Instance Types for Big Data
    3. EC2 for Big Data
    4. Interacting with Data with AWS AppSync and Amazon Kendra
    5. AWS Data Exchange
    6. Amazon AppFlow
  9. Chapter 9 : Preparing for the Exam
    1. Exam Tips
    2. State of Learning Checkpoint
    3. Exam Walkthrough and Signup
    4. Save 50% on Your AWS Exam Cost!
    5. Get an Extra 30 Minutes in Your AWS Exam - Non-Native English Speakers Only
  10. Chapter 10 : Appendix - Machine Learning Topics for the Legacy AWS Certified Big Data Exam
    1. Machine Learning 101
    2. Classification Models
    3. Amazon ML Service
    4. SageMaker
    5. Deep Learning 101
    6. (Exercise) Amazon Machine Learning, Part 1
    7. (Exercise) Amazon Machine Learning, Part 2
  11. Chapter 11 : Wrapping Up
    1. AWS Certification Paths
    2. Congratulations! Now, Make Sure You Are Ready

Product information

  • Title: AWS Certified Data Analytics Specialty (2023) Hands-on
  • Author(s): Frank Kane, Stéphane Maarek
  • Release date: December 2020
  • Publisher(s): Packt Publishing
  • ISBN: 9781838983383