Live Online training

AWS Certified Big Data - Specialty Crash Course

Noah Gift

In this online course, Noah Gift will cover how to prepare for one of the hottest certification exams in 2019, the AWS Big Data Certification. This training will cover the six core areas of the certification: Collection, Storage, Processing, Analysis, Visualization, and Data Security. The final portion of the course will cover real-world case studies of Big Data problems on AWS.

Data Science is one of the hottest jobs, but it is often said that you need five data engineers to support one data scientist. The reason for this ratio is that data engineering is a challenging job that requires cutting-edge skills that are often platform-specific. This course helps data engineers round out their knowledge of AWS Big Data fundamentals so they can achieve certification on the AWS platform.

What you'll learn and how you can apply it

In this course, you will learn how to:

  • perform collection tasks on AWS
  • use the appropriate storage solution for Big Data on AWS
  • perform processing tasks on the AWS platform
  • couple Visualization, Analysis and Data Security to reason about Big Data on AWS
  • think about the AWS Big Data Certification exam to optimize for the best outcome.

This training course is for you because...

You are a:

  • DevOps Engineer who wants to understand how to operationalize Big Data workloads.
  • Software Engineer who wants to master Big Data terminology and practices on AWS.
  • Machine Learning Engineer who wants to solidify their knowledge of AWS Big Data practices.
  • Product Manager who needs to understand the AWS Big Data lifecycle.
  • Data Scientist who runs Big Data workloads on AWS.


Prerequisites

  • 1-2 years of experience with AWS and six months using Big Data tools: Spark, Hadoop, and Python.
  • Ideally, candidates will have already passed the AWS Cloud Practitioner certification.

Course Set-up

Create a free AWS Account: https://aws.amazon.com

Recommended Preparation

Recommended Follow-up

  • https://learning.oreilly.com/videos/aws-certified-machine/9780135556597

About your instructor

  • Noah Gift is a lecturer in the University of California, Berkeley, graduate data science program, the Northwestern University graduate data science program, and the MSBA program at the University of California, Davis, Graduate School of Management. He consults with startups and other companies on machine learning and cloud architecture and does CTO-level consulting as the founder of Pragmatic AI Labs. Noah has approximately 20 years’ experience programming in Python and is a Python Software Foundation Fellow. Previously, he worked for a variety of companies in roles such as CTO, general manager, consulting CTO, and cloud architect. He’s published over 100 technical publications, including books on cloud machine learning and DevOps, for O’Reilly, Pearson, DataCamp, Udacity, and other publishers. He’s also a certified AWS Solutions Architect. Noah earned an MBA from the University of California, Davis, an MS in computer information systems from California State University, Los Angeles, and a BS in nutritional science from Cal Poly, in San Luis Obispo. You can find more about Noah by following him on GitHub, visiting his website, or connecting with him on LinkedIn.


Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Day 1

Part 1: AWS Big Data - Specialty Certification Overview, Collection, and Storage

Length (90 min)

  • Get an overview of the certification
  • Use exam study resources
  • Review the exam guide
  • Learn the exam strategy
  • Learn the best practices of Big Data on AWS
  • Learn the techniques to accelerate hands-on practice
  • Understand important Big Data related services
  • Determine the operational characteristics of the collection system
  • Determine and optimize the operational characteristics of the storage solution

Q&A (15 min)

Break (15 min)

Part 2: Processing for Big Data on AWS

Length (45 min)

  • Identify the appropriate data processing technology for a given scenario
  • Determine how to design and architect the data processing solution
  • Determine the operational characteristics of the solution implemented
  • Get an overview of AWS processing
  • Understand Elastic MapReduce (EMR)
  • Learn about Apache Hadoop: introduction
  • Apply EMR: architecture

Q&A (10 min)

Break (5 min)

Part 3: Analysis for Big Data on AWS

Length (45 min)

  • Determine the tools and techniques required for analysis
  • Determine how to design and architect the analytical solution
  • Determine and optimize the operational characteristics of the analysis
  • Get an overview of Redshift
  • Learn Redshift design
  • Use Redshift data ingestion
  • Apply Redshift operations
  • Use AWS Elasticsearch: operational analytics
  • Implement machine learning: clustering and regression
  • Use AWS Athena: interactive analytics

Q&A (15 min)

Day 2

Part 4: Visualization & Data Security for Big Data on AWS

Length (90 min)

  • Determine the appropriate techniques for delivering the results/output
  • Determine how to design and create the visualization platform
  • Determine and optimize the operational characteristics of the visualization system
  • Get an overview of AWS visualization
  • Use AWS QuickSight: dashboards and visualizations
  • Determine encryption requirements and/or implementation technologies
  • Choose the appropriate technology to enforce data governance
  • Identify how to ensure data integrity
  • Evaluate regulatory requirements
  • Implement AWS IAM
  • Implement EMR security
  • Implement Redshift security

Q&A (15 min)

Break (15 min)

Part 5: Case Studies Part 1

Length (45 min)

  • Understand Big Data for SageMaker
  • Learn SageMaker and EMR integration
  • Learn serverless production Big Data application development

Q&A (10 min)

Break (5 min)

Part 6: Case Studies Part 2 and Exam Sample Questions Review

Length (45 min)

  • Implement containerization for Big Data
  • Implement Spot Instances for a Big Data pipeline
  • Exam review

Q&A (15 min)