MongoDB First Steps

An introduction to ETL operations and the aggregation framework

Topic: Data
Axel Sirota

MongoDB is a document database that lets you express anything from simple lookups to full-text search, geospatial queries, and complex ETL (extract, transform, load) operations in just a couple of lines. It’s fast, scalable, reliable, schemaless, and ACID-compliant. Because you can view data as JSON documents and write queries, transformations, and transactions as code, MongoDB is easy to debug and fast to develop with.

Expert Axel Sirota walks you through MongoDB basics and its concurrency model, then leads a deep dive into aggregations using both the aggregation framework and MapReduce. Join in to learn tips and tricks for integrating MongoDB into your own applications and benefiting from its high availability and flexible data structures.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • Why MongoDB is so popular
  • How to perform basic CRUD (create, read, update, delete) operations in MongoDB
  • MongoDB’s concurrency model and how it enables high availability and performance
  • Pros and cons of MongoDB’s aggregation framework and MapReduce, and when to use each

And you’ll be able to:

  • Create complex ETL transformations to get the most out of your raw data
  • Build applications that use MongoDB as their database
  • Optimize queries to boost their performance by as much as 30x

This training course is for you because...

  • You’re a data engineer who needs to add and maintain complex ETL aggregations over your data pipeline.
  • You work with a product team and need to create clean reports out of raw data to understand business needs.
  • You’re a DBA who needs to optimize a MongoDB deployment to boost its performance and reliability.

Prerequisites

  • Familiarity with basic SQL commands (SELECT, WHERE, INSERT, UPDATE, DELETE, etc.)

About your instructor

  • Axel Sirota has a master’s degree in mathematics and a deep interest in deep learning and machine learning operations. After research in probability, statistics, and machine learning optimization, he now works at JAMPP as a machine learning research engineer, using customer data to make accurate predictions for real-time bidding.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Introduction (20 minutes)

  • Presentation: What is MongoDB, and why has it become so famous?

CRUD operations in MongoDB (50 minutes)

  • Presentation: Performing FIND, INSERT, UPDATE, and DELETE operations in MongoDB; concurrency in MongoDB
  • Katacoda interactive exercises: CRUD operations with MongoDB; first steps in transactions in MongoDB
  • Group discussion: ACID operations in MongoDB—transactions
  • Q&A

Break (10 minutes)

Aggregation in MongoDB (95 minutes)

  • Presentation: MapReduce versus the aggregation framework; stages of the aggregation framework
  • Hands-on exercise and live demo: Perform an aggregation in two ways, with MapReduce and with the aggregation framework
  • Katacoda interactive exercises: Getting started with grouping in MongoDB; advanced ETL with projections in MongoDB; handling arrays with unwind in MongoDB
  • Group discussion: Further resources on the aggregation framework
  • Q&A
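
To give a flavor of the comparison in this session, here is a minimal pure-Python sketch of one aggregation computed in two styles. It is not MongoDB code and is not taken from the course materials: the sample documents and every function name are hypothetical, and the helpers only imitate the behavior of MapReduce and of the $unwind and $group stages on an in-memory list. In MongoDB, the second style would correspond roughly to a pipeline such as [{"$unwind": "$items"}, {"$group": {"_id": "$items.sku", "total": {"$sum": "$items.qty"}}}].

```python
from collections import defaultdict

# Hypothetical sample documents; in MongoDB these would live in a collection.
orders = [
    {"_id": 1, "items": [{"sku": "a", "qty": 2}, {"sku": "b", "qty": 1}]},
    {"_id": 2, "items": [{"sku": "a", "qty": 3}]},
    {"_id": 3, "items": []},
]


def map_reduce_totals(docs):
    """MapReduce style: emit (key, value) pairs, then reduce the values per key."""
    emitted = defaultdict(list)
    for doc in docs:  # map phase: emit one (sku, qty) pair per item
        for item in doc["items"]:
            emitted[item["sku"]].append(item["qty"])
    # reduce phase: fold each key's emitted values into a single total
    return {sku: sum(qtys) for sku, qtys in emitted.items()}


def unwind(docs, field):
    """Imitates $unwind: one output document per array element.

    Documents whose array is empty produce no output, which matches
    the default behavior of $unwind.
    """
    for doc in docs:
        for element in doc[field]:
            yield {**doc, field: element}


def group_sum(docs, key, value):
    """Imitates $group with a $sum accumulator."""
    totals = defaultdict(int)
    for doc in docs:
        totals[key(doc)] += value(doc)
    return dict(totals)


def pipeline_totals(docs):
    """Pipeline style: chain small stages, each consuming the previous one's output."""
    unwound = unwind(docs, "items")
    return group_sum(
        unwound,
        key=lambda d: d["items"]["sku"],
        value=lambda d: d["items"]["qty"],
    )


if __name__ == "__main__":
    print(map_reduce_totals(orders))
    print(pipeline_totals(orders))
```

Both functions return the same totals per SKU. The pipeline version is built from small, reusable stages, which is the main readability argument usually made for the aggregation framework.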

Break (10 minutes)

Aggregation in MongoDB: Advanced topics (45 minutes)

  • Presentation: Concurrency and performance comparisons in aggregations
  • Hands-on exercise and live demo: Analyze the query planner from the shell and optimize an aggregation pipeline query with several performance points to improve, including $project + $match optimization, $lookup + $unwind coalescence, and targeted indexes applied with $hint
  • Group discussion: Incremental datasets with MapReduce
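
The idea behind the $project + $match optimization mentioned above can be sketched in pure Python: filtering before projecting gives the same result while reshaping far fewer documents. This is an illustration under stated assumptions, not MongoDB code. The sample events and all function names are hypothetical, and the counter only stands in for the work a real server would do. MongoDB's optimizer can apply this reordering itself when the $match does not depend on fields computed by the $project.

```python
# Hypothetical event documents: 1 in 10 comes from "AR", the rest from "US".
events = [
    {"_id": i, "country": "AR" if i % 10 == 0 else "US", "clicks": i, "payload": "x" * 50}
    for i in range(100)
]


def project(docs, fields, counter):
    """Imitates an inclusion $project: keep _id plus the named fields."""
    for doc in docs:
        counter["projected"] += 1  # count how many documents get reshaped
        yield {k: doc[k] for k in ("_id", *fields) if k in doc}


def match(docs, predicate):
    """Imitates $match: pass through only the documents that satisfy the predicate."""
    for doc in docs:
        if predicate(doc):
            yield doc


def is_ar(doc):
    return doc["country"] == "AR"


def project_then_match(docs):
    """Unoptimized order: reshape every document, then filter."""
    counter = {"projected": 0}
    out = list(match(project(docs, ["country", "clicks"], counter), is_ar))
    return out, counter["projected"]


def match_then_project(docs):
    """Optimized order: filter first, then reshape only the survivors."""
    counter = {"projected": 0}
    out = list(project(match(docs, is_ar), ["country", "clicks"], counter))
    return out, counter["projected"]


if __name__ == "__main__":
    slow_result, slow_work = project_then_match(events)
    fast_result, fast_work = match_then_project(events)
    print(slow_result == fast_result, slow_work, fast_work)
```

With this sample data, both orders return the same ten documents, but the second projects 10 documents instead of 100. On a real deployment, an early $match can also use an index, which this sketch does not model.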

Wrap-up and Q&A (10 minutes)