Mastering MongoDB 4.x - Second Edition

Book description

Leverage the power of MongoDB 4.x to build and administer fault-tolerant database applications

Key Features

  • Master the new features and capabilities of MongoDB 4.x
  • Implement advanced data modeling, querying, and administration techniques in MongoDB
  • Explore rich case studies and best practices followed by expert MongoDB developers

Book Description

MongoDB is the best platform for working with non-relational data and is considered to be the smartest tool for organizing data in line with business needs. The recently released MongoDB 4.x supports ACID transactions and makes the technology an asset for enterprises across the IT and fintech sectors.

This book provides expertise in advanced and niche areas of managing databases (such as modeling and querying databases) along with various administration techniques in MongoDB, thereby helping you become a successful MongoDB expert. The book uses examples and large datasets to help you understand how the newly added capabilities function. You will dive deeper into niche areas such as high-performance configurations, optimizing SQL statements, configuring large-scale sharded clusters, and more. You will also master best practices for handling database failover, as well as recovery and backup procedures for database security.

By the end of the book, you will have gained a practical understanding of administering database applications both on premises and in the cloud; you will also be able to scale database applications across all servers.

What you will learn

  • Perform advanced querying techniques such as indexing and expressions
  • Configure, monitor, and maintain a highly scalable MongoDB environment
  • Master replication and data sharding to optimize read/write performance
  • Administer MongoDB-based applications on premises or in the cloud
  • Integrate MongoDB with big data sources to process huge amounts of data
  • Deploy MongoDB on Kubernetes containers
  • Use MongoDB in IoT, mobile, and serverless environments

Who this book is for

This book is ideal for MongoDB developers and database administrators who wish to become successful MongoDB experts and build scalable and fault-tolerant applications using MongoDB. It will also be useful for database professionals who wish to become certified MongoDB professionals. Some understanding of MongoDB and basic database concepts is required to get the most out of this book.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Mastering MongoDB 4.x Second Edition
  3. About Packt
    1. Why subscribe?
    2. Packt.com
  4. Contributors
    1. About the author
    2. About the reviewers
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Conventions used
    4. Get in touch
      1. Reviews
  6. Section 1: Basic MongoDB – Design Goals and Architecture
  7. MongoDB – A Database for Modern Web
    1. Technical requirements
    2. The evolution of SQL and NoSQL
      1. The evolution of MongoDB
        1. Major feature set for versions 1.0 and 1.2
        2. Version 2
        3. Version 3
        4. Version 4
      2. MongoDB for SQL developers
      3. MongoDB for NoSQL developers
    3. MongoDB's key characteristics and use cases
      1. Key characteristics
      2. Use cases for MongoDB
      3. MongoDB criticism
    4. MongoDB configuration and best practices
      1. Operational best practices
      2. Schema design best practices
      3. Best practices for write durability
      4. Best practices for replication
      5. Best practices for sharding
      6. Best practices for security
      7. Best practices for AWS
    5. Reference documentation
      1. MongoDB documentation
      2. Packt references
    6. Further reading
    7. Summary
  8. Schema Design and Data Modeling
    1. Relational schema design
      1. MongoDB schema design
        1. Read-write ratio
    2. Data modeling
      1. Data types
        1. Comparing different data types
          1. Date type
          2. ObjectId
    3. Modeling data for atomic operations
      1. Write isolation
      2. Read isolation and consistency
    4. Modeling relationships
      1. One-to-one
      2. One-to-many and many-to-many
    5. Modeling data for keyword searches
    6. Connecting to MongoDB
      1. Connecting using Ruby
        1. Mongoid ODM
        2. Inheritance with Mongoid models
      2. Connecting using Python
        1. PyMODM ODM
        2. Inheritance with PyMODM models
      3. Connecting using PHP
        1. Doctrine ODM
        2. Inheritance with Doctrine
    7. Summary
  9. Section 2: Querying Effectively
  10. MongoDB CRUD Operations
    1. CRUD using the shell
      1. Scripting for the mongo shell
        1. The differences between scripting for the mongo shell and using it directly
        2. Batch inserts using the shell
        3. Batch operations using the mongo shell
      2. Administration
        1. fsync
        2. compact
        3. currentOp and killOp
        4. collMod
        5. touch
      3. MapReduce in the mongo shell
        1. MapReduce concurrency
        2. Incremental MapReduce
        3. Troubleshooting MapReduce
      4. Aggregation framework
        1. SQL to aggregation
        2. Aggregation versus MapReduce
      5. Securing the shell
        1. Authentication and authorization
        2. Authorization with MongoDB
        3. Security tips for MongoDB
          1. Encrypting communication using TLS/SSL
          2. Encrypting data
          3. Limiting network exposure
          4. Firewalls and VPNs
          5. Auditing
          6. Using secure configuration options
      6. Authentication with MongoDB
        1. Enterprise Edition
          1. Kerberos authentication
          2. LDAP authentication
    2. Summary
  11. Advanced Querying
    1. MongoDB CRUD operations
      1. CRUD using the Ruby driver
        1. Creating documents
        2. Read
        3. Chaining operations in find()
        4. Nested operations
        5. Update
        6. Delete
        7. Batch operations
      2. CRUD in Mongoid
        1. Read
        2. Scoping queries
        3. Create, update, and delete
      3. CRUD using the Python driver
        1. Creating and deleting
        2. Finding documents
        3. Updating documents
      4. CRUD using PyMODM
        1. Creating documents
        2. Updating documents
        3. Deleting documents
        4. Querying documents
      5. CRUD using the PHP driver
        1. Creating and deleting
        2. BulkWrite
        3. Read
        4. Updating documents
      6. CRUD using Doctrine
        1. Creating, updating, and deleting
        2. Read
        3. Best practices
      7. Comparison operators
      8. Update operators
      9. Smart querying
        1. Using regular expressions
        2. Querying results and cursors
        3. Storage considerations for the delete operation
    2. Change streams
      1. Introduction
        1. Setup
        2. Using change streams
        3. Specification
        4. Important notes
        5. Production recommendations
        6. Replica sets
        7. Sharded clusters
    3. Summary
  12. Multi-Document ACID Transactions
    1. Background
    2. ACID
      1. Atomicity
      2. Consistency
      3. Isolation
        1. Phantom reads
        2. Non-repeatable reads
        3. Dirty reads
      4. Durability
      5. When do we need ACID in MongoDB?
      6. Building a digital bank using MongoDB
        1. Setting up our data
        2. Transferring between accounts – part 1
        3. Transferring between accounts – part 2
        4. Transferring between accounts – part 3
    3. E-commerce using MongoDB
      1. The best practices and limitations of multi-document ACID transactions
    4. Summary
  13. Aggregation
    1. Why aggregation?
    2. Aggregation operators
      1. Aggregation stage operators
      2. Expression operators
        1. Expression Boolean operators
        2. Expression comparison operators
        3. Set expression and array operators
        4. Expression date operators
        5. Expression string operators
        6. Expression arithmetic operators
        7. Aggregation accumulators
        8. Conditional expressions
        9. Type conversion operators
        10. Other operators
          1. Text search
          2. Variable
          3. Literal
          4. Parsing data type
    3. Limitations
    4. Aggregation use case
    5. Summary
  14. Indexing
    1. Index internals
      1. Index types
        1. Single field indexes
        2. Dropping indexes
          1. Indexing embedded fields
          2. Indexing embedded documents
          3. Background indexes
        3. Compound indexes
          1. Sorting with compound indexes
          2. Reusing compound indexes
        4. Multikey indexes
        5. Special types of indexes
          1. Text indexes
          2. Hashed indexes
          3. Time to live indexes
          4. Partial indexes
          5. Sparse indexes
          6. Unique indexes
          7. Case-insensitive
          8. Geospatial indexes
          9. 2d geospatial indexes
          10. 2dsphere geospatial indexes
          11. geoHaystack indexes
      2. Building and managing indexes
        1. Forcing index usage
          1. Hint and sparse indexes
          2. Building indexes on replica sets
        2. Managing indexes
          1. Naming indexes
          2. Special considerations
      3. Using indexes efficiently
        1. Measuring performance
          1. Improving performance
          2. Index intersection
    2. Further reading
    3. Summary
  15. Section 3: Administration and Data Management
  16. Monitoring, Backup, and Security
    1. Monitoring
      1. What should we monitor?
        1. Page faults
        2. Resident memory
        3. Virtual and mapped memory
        4. Working sets
      2. Monitoring memory usage in WiredTiger
      3. Tracking page faults
      4. Tracking B-tree misses
        1. I/O wait
        2. Read and write queues
        3. Lock percentage
        4. Background flushes
        5. Tracking free space
        6. Monitoring replication
        7. Oplog size
      5. Working set calculations
      6. Monitoring tools
        1. Hosted tools
        2. Open source tools
    2. Backups
      1. Backup options
        1. Cloud-based solutions
        2. Backups with filesystem snapshots
        3. Making a backup of a sharded cluster
        4. Making backups using mongodump
        5. Backing up by copying raw files
        6. Making backups using queuing
      2. EC2 backup and restore
      3. Incremental backups
    3. Security
      1. Authentication
      2. Authorization
        1. User roles
        2. Database administration roles
        3. Cluster administration roles
        4. Backup and restore roles
        5. Roles across all databases
          1. Superuser
      3. Network-level security
      4. Auditing security
      5. Special cases
      6. Overview
    4. Summary
  17. Storage Engines
    1. Pluggable storage engines
      1. WiredTiger
        1. Document-level locking
        2. Snapshots and checkpoints
        3. Journaling
        4. Data compression
        5. Memory usage
        6. readConcern
        7. WiredTiger collection-level options
        8. WiredTiger performance strategies
        9. WiredTiger B-tree versus LSM indexes
      2. Encrypted
      3. In-memory
      4. MMAPv1
        1. MMAPv1 storage optimization
      5. Mixed usage
      6. Other storage engines
        1. RocksDB
        2. TokuMX
    2. Locking in MongoDB
      1. Lock reporting
      2. Lock yield
      3. Commonly used commands and locks
      4. Commands requiring a database lock
    3. Further reading
    4. Summary
  18. MongoDB Tooling
    1. Introduction
      1. MongoDB Atlas
        1. Creating a new cluster
          1. Important notes
      2. MongoDB Cloud Manager
      3. MongoDB Ops Manager
      4. MongoDB Charts
      5. MongoDB Compass
      6. MongoDB Connector for Business Intelligence (BI)
    2. An introduction to Kubernetes
      1. Enterprise Kubernetes Operator
      2. MongoDB Mobile
      3. MongoDB Stitch
        1. QueryAnywhere
          1. Rules
        2. Functions
        3. Triggers
      4. Mobile Sync
    3. Summary
  19. Harnessing Big Data with MongoDB
    1. What is big data?
      1. The big data landscape
      2. Message queuing systems
        1. Apache ActiveMQ
        2. RabbitMQ
        3. Apache Kafka
      3. Data warehousing
        1. Apache Hadoop
        2. Apache Spark
        3. Comparing Spark with Hadoop MapReduce
      4. MongoDB as a data warehouse
    2. A big data use case
      1. Setting up Kafka
      2. Setting up Hadoop
        1. Steps for Hadoop setup
      3. Using a Hadoop to MongoDB pipeline
      4. Setting up Spark to MongoDB
    3. Further reading
    4. Summary
  20. Section 4: Scaling and High Availability
  21. Replication
    1. Replication
      1. Logical or physical replication
      2. Different high availability types
    2. An architectural overview
    3. How do elections work?
    4. What is the use case for a replica set?
    5. Setting up a replica set
      1. Converting a standalone server into a replica set
      2. Creating a replica set
      3. Read preference
      4. Write concern
        1. Custom write concerns
      5. Priority settings for replica set members
        1. Zero priority replica set members
        2. Hidden replica set members
        3. Delayed replica set members
      6. Production considerations
    6. Connecting to a replica set
    7. Replica set administration
      1. How to perform maintenance on replica sets
      2. Re-syncing a member of a replica set
      3. Changing the oplog's size
      4. Reconfiguring a replica set when we have lost the majority of our servers
      5. Chained replication
    8. Cloud options for a replica set
      1. mLab
      2. MongoDB Atlas
    9. Replica set limitations
    10. Summary
  22. Sharding
    1. Why do we use sharding?
    2. Architectural overview
      1. Development, continuous deployment, and staging environments
      2. Planning ahead with sharding
    3. Sharding setup
      1. Choosing the shard key
        1. Changing the shard key
      2. Choosing the correct shard key
        1. Range-based sharding
        2. Hash-based sharding
        3. Coming up with our own key
        4. Location-based data
    4. Sharding administration and monitoring
      1. Balancing data – how to track and keep our data balanced
      2. Chunk administration
        1. Moving chunks
        2. Changing the default chunk size
        3. Jumbo chunks
        4. Merging chunks
        5. Adding and removing shards
      3. Sharding limitations
    5. Querying sharded data
      1. The query router
        1. Find
        2. Sort/limit/skip
        3. Update/remove
      2. Querying using Ruby
      3. Performance comparison with replica sets
    6. Sharding recovery
      1. mongos
      2. mongod
      3. Config server
      4. A shard goes down
      5. The entire cluster goes down
    7. Further reading
    8. Summary
  23. Fault Tolerance and High Availability
    1. Application design
      1. Schema-less doesn't mean schema design-less
      2. Read performance optimization
        1. Consolidating read querying
      3. Defensive coding
        1. Monitoring integrations
    2. Operations
    3. Security
      1. Enabling security by default
      2. Isolating our servers
      3. Checklists
    4. Further reading
    5. Summary
  24. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Mastering MongoDB 4.x - Second Edition
  • Author(s): Alex Giamas
  • Release date: March 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789617870