Practical MongoDB Aggregations

Book description

Begin your journey toward efficient data manipulation with this robust technical guide, and enhance your aggregation skills while building efficient pipelines for a variety of tasks.

Key Features

  • Build effective aggregation pipelines for increased productivity and performance
  • Solve common data manipulation and analysis problems with the help of practical examples
  • Learn essential strategies to aggregate time series data in financial datasets and IoT
  • Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Officially endorsed by MongoDB, Inc., Practical MongoDB Aggregations helps you unlock the full potential of the MongoDB aggregation framework, including the latest features of MongoDB 7.0. This book provides practical, easy-to-digest principles and approaches for increasing your effectiveness in developing aggregation pipelines, supported by examples for building pipelines to solve complex data manipulation and analytical tasks.

This book is written for developers, architects, data analysts, data engineers, and data scientists with some familiarity with the aggregation framework. It begins by explaining the framework's architecture and then shows you how to build pipelines optimized for productivity and scale.

Given the critical role arrays play in MongoDB's document model, the book delves into best practices for manipulating arrays. The latter part of the book equips you with examples that solve common data processing challenges, so you can apply the lessons you've learned to practical situations. By the end of this book, you'll have learned how to use the MongoDB aggregation framework to streamline your data analysis and manipulation processes.

What you will learn

  • Develop dynamic aggregation pipelines tailored to changing business requirements
  • Master essential techniques to optimize aggregation pipelines for rapid data processing
  • Apply aggregations to vast datasets efficiently by using effective sharding strategies
  • Eliminate the performance penalties of processing data externally by filtering, grouping, and calculating aggregated values directly within the database
  • Use pipelines to help you secure your data access and distribution

Who this book is for

This book is for intermediate-level developers, architects, analysts, engineers, and data scientists who want to learn about the aggregation capabilities in MongoDB. A working knowledge of MongoDB is needed to get the most out of this book.

Table of contents

  1. First edition
  2. Acknowledgements
  3. Foreword
  4. Preface
    1. How will this book help you?
    2. Who this book is for
    3. What this book covers
    4. To get the most out of this book
    5. Download the example code files
    6. Conventions used
    7. Get in touch
    8. Download a free PDF copy of this book
  5. Chapter 1: MongoDB Aggregations Explained
    1. What is the MongoDB aggregation framework?
    2. What is the MongoDB aggregation language?
    3. What do developers use the aggregation framework for?
    4. A short history of MongoDB aggregations
      1. Aggregation capabilities in MongoDB server releases
    5. Getting going
      1. Setting up your environment
      2. Database
      3. Client tool
    6. Getting further help
    7. Summary
  6. Part 1: Guiding Tips and Principles
  7. Chapter 2: Optimizing Pipelines for Productivity
    1. Embrace composability for increased productivity
      1. Guiding principles to promote composability
      2. Using macro functions
      3. So, what's the best way of factoring out code?
    2. Better alternatives for a projection stage
      1. When to use $set and $unset
      2. When to use $project
      3. The hidden danger of $project
      4. Key projection takeaways
    3. Summary
  8. Chapter 3: Optimizing Pipelines for Performance
    1. Using explain plans to identify performance bottlenecks
      1. Viewing an explain plan
      2. Understanding the explain plan
    2. Guidance for optimizing pipeline performance
      1. Be cognizant of streaming vs blocking stages ordering
      2. Avoid unwinding and regrouping documents just to process each array's elements
      3. Encourage match filters to appear early in the pipeline
    3. Summary
  9. Chapter 4: Harnessing the Power of Expressions
    1. Aggregation expressions explained
    2. What do expressions produce?
      1. Chaining operator expressions together
    3. Can all stages use expressions?
      1. What is using $expr inside $match all about?
      2. Restrictions when using expressions within $match
    4. Advanced use of expressions for array processing
      1. if-else conditional comparison
      2. The power array operators
      3. for-each looping to transform an array
      4. for-each looping to compute a summary value from an array
      5. for-each looping to locate an array element
      6. Reproducing $map behavior using $reduce
      7. Adding new fields to existing objects in an array
      8. Rudimentary schema reflection using arrays
    5. Summary
  10. Chapter 5: Optimizing Pipelines for Sharded Clusters
    1. A brief summary of MongoDB sharded clusters
    2. Sharding implications for pipelines
      1. Sharded aggregation constraints
    3. Where does a sharded aggregation run?
      1. Pipeline splitting at runtime
      2. Execution of the split pipeline shards
      3. Execution of the merger part of the split pipeline
      4. Difference in merging behavior for grouping versus sorting
    4. Performance tips for sharded aggregations
    5. Summary
  11. Part 2: Aggregations by Example
  12. Chapter 6: Foundational Examples: Filtering, Grouping, and Unwinding
    1. Filtered top subset
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline results
      6. Pipeline observations
    2. Group and total
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    3. Unpack arrays and group differently
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline results
      6. Pipeline observations
    4. Distinct list of values
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    5. Summary
  13. Chapter 7: Joining Data Examples
    1. One-to-one join
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    2. Multi-field join and one-to-many
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    3. Summary
  14. Chapter 8: Fixing and Generating Data Examples
    1. Strongly typed conversion
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    2. Converting incomplete date strings
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    3. Generating mock test data
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    4. Summary
  15. Chapter 9: Trend Analysis Examples
    1. Faceted classification
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    2. Largest graph network
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    3. Incremental analytics
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    4. Summary
  16. Chapter 10: Securing Data Examples
    1. Redacted view
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    2. Mask sensitive fields
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    3. Role programmatic restricted view
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    4. Summary
  17. Chapter 11: Time-Series Examples
    1. IoT power consumption
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    2. State change boundaries
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    3. Summary
  18. Chapter 12: Array Manipulation Examples
    1. Summarizing arrays for first, last, minimum, maximum, and average values
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    2. Pivoting array items by a key
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    3. Array sorting and percentiles
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    4. Array element grouping
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    5. Array fields joining
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    6. Comparison of two arrays
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    7. Jagged array condensing
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    8. Summary
  19. Chapter 13: Full-Text Search Examples
    1. What is Atlas Search?
    2. Compound text search criteria
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    3. Facets and counts text search
      1. Scenario
      2. Populating the sample data
      3. Defining the aggregation pipeline
      4. Executing the aggregation pipeline
      5. Expected pipeline result
      6. Pipeline observations
    4. Summary
  20. Appendix
    1. Create an Atlas Search index
  21. Afterword
  22. Index
    1. Why subscribe?
  23. Other books you may enjoy
    1. Packt is searching for authors like you
    2. Download a free PDF copy of this book

Product information

  • Title: Practical MongoDB Aggregations
  • Author(s): Paul Done
  • Release date: September 2023
  • Publisher(s): Packt Publishing
  • ISBN: 9781835080641