Live Online Training

AWS Certified Big Data - Specialty Crash Course

Noah Gift

In this online course, Noah Gift will cover how to prepare for one of the hottest certification exams in 2019, the AWS Big Data Certification. This training will cover the six core areas of the certification: Collection, Storage, Processing, Analysis, Visualization, and Data Security. The final portion of the course will cover real-world case studies of Big Data problems on AWS.

Data Science is one of the hottest jobs, but it is often said that you need five data engineers to support one data scientist. The reason for this ratio is that data engineering is a challenging job that requires cutting-edge skills that are often platform-specific. This course helps data engineers round out their knowledge of AWS Big Data fundamentals so they can achieve certification on the AWS platform.

What you'll learn and how you can apply it

In this course, you will learn how to:

  • perform collection tasks on AWS
  • use the appropriate storage solution for Big Data on AWS
  • perform processing tasks on the AWS platform
  • couple Visualization, Analysis and Data Security to reason about Big Data on AWS
  • think about the AWS Big Data Certification exam to optimize for the best outcome.

This training course is for you because...

You are a:

  • DevOps Engineer who wants to understand how to operationalize Big Data workloads.
  • Software Engineer who wants to master Big Data terminology and practices on AWS.
  • Machine Learning Engineer who wants to solidify their knowledge of AWS Big Data practices.
  • Product Manager who needs to understand the AWS Big Data lifecycle.
  • Data Scientist who runs Big Data workloads on AWS.

Prerequisites

  • 1-2 years of experience with AWS and six months using Big Data tools: Spark, Hadoop, and Python.
  • Ideally, candidates would have already passed the AWS Cloud Practitioner cert.

Course Set-up

Create a free AWS Account: https://aws.amazon.com

Recommended Preparation

Recommended Follow-up

  • https://learning.oreilly.com/videos/aws-certified-machine/9780135556597

About your instructor

  • Noah Gift is a lecturer and consultant at both the UC Davis Graduate School of Management MSBA program and the Graduate Data Science program (MSDS) at Northwestern. He teaches and designs graduate machine learning, AI, and Data Science courses, and consults on Machine Learning and Cloud Architecture for students and faculty. These responsibilities include leading a multi-cloud certification initiative for students. He has published close to 100 technical publications, including two books, on subjects ranging from Cloud Machine Learning to DevOps. Gift received an MBA from UC Davis, an M.S. in Computer Information Systems from Cal State Los Angeles, and a B.S. in Nutritional Science from Cal Poly San Luis Obispo.

    Professionally, Noah has approximately 20 years’ experience programming in Python. He is a Python Software Foundation Fellow, AWS Subject Matter Expert (SME) on Machine Learning, AWS Certified Solutions Architect, AWS Academy Accredited Instructor, Google Certified Professional Cloud Architect, and Microsoft MTA on Python. He has worked in roles including CTO, General Manager, Consulting CTO, and Cloud Architect. This experience has been with a wide variety of companies, including ABC, Caltech, Sony Imageworks, Disney Feature Animation, Weta Digital, AT&T, Turner Studios, and Linden Lab. In the last ten years, he has been responsible for shipping many new products at multiple companies that generated millions of dollars of revenue and had global scale. Currently, he consults for startups and other companies.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Day 1

Part 1: AWS Big Data - Specialty Certification Overview, Collection, and Storage

Length (90 min)

  • Get an overview of the certification
  • Use exam study resources
  • Review the exam guide
  • Learn the exam strategy
  • Learn the best practices of Big Data on AWS
  • Learn the techniques to accelerate hands-on practice
  • Understand important Big Data related services
  • Determine the operational characteristics of the collection system
  • Determine and optimize the operational characteristics of the storage solution

Q&A (15 min)

Break (15 min)

Part 2: Processing for Big Data on AWS

Length (45 min)

  • Identify the appropriate data processing technology for a given scenario
  • Determine how to design and architect the data processing solution
  • Determine the operational characteristics of the solution implemented
  • Understand Overview of AWS Processing
  • Understand Elastic MapReduce (EMR)
  • Learn about Apache Hadoop - Intro
  • Apply EMR - Architecture

Q&A (10 min)

Break (5 min)

Part 3: Analysis for Big Data on AWS

Length (45 min)

  • Determine the tools and techniques required for analysis
  • Determine how to design and architect the analytical solution
  • Determine and optimize the operational characteristics of the Analysis
  • Understand Redshift Overview
  • Learn Redshift Design
  • Use Redshift Data Ingestion
  • Apply Redshift Operations
  • Use AWS Elasticsearch - operational analytics
  • Implement Machine Learning - Clustering & Regression
  • Use AWS Athena - interactive analytics

Q&A (15 min)

Day 2

Part 4: Visualization & Data Security for Big Data on AWS

Length (90 min)

  • Determine the appropriate techniques for delivering the results/output
  • Determine how to design and create the Visualization platform
  • Determine and optimize the operational characteristics of the Visualization system
  • Understand AWS Visualization - Overview
  • Use AWS QuickSight - dashboards & visualizations
  • Determine encryption requirements and/or implementation technologies
  • Choose the appropriate technology to enforce data governance
  • Identify how to ensure data integrity
  • Evaluate regulatory requirements
  • Implement AWS IAM
  • Implement EMR Security
  • Implement Redshift Security

Q&A (15 min)

Break (15 min)

Part 5: Case Studies Part 1

Length (45 min)

  • Understand Big Data for SageMaker
  • Learn SageMaker and EMR Integration
  • Learn Serverless Production Big Data Application Development

Q&A (10 min)

Break (5 min)

Part 6: Case Studies Part 2 and Exam Sample Questions Review

Length (45 min)

  • Implement Containerization for Big Data
  • Implement Spot Instances for Big Data Pipeline
  • Exam Review

Q&A (15 min)