MongoDB: The Definitive Guide

by Kristina Chodorow, Michael Dirolf

September 2010

Intermediate to advanced

214 pages

5h 30m

English

O'Reilly Media, Inc.

Foreword
Preface
How This Book Is Organized; Getting Up to Speed with MongoDB; Developing with MongoDB; Advanced Usage; Administration; Developing Applications with MongoDB; Appendixes; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Acknowledgments; Acknowledgments from Kristina; Acknowledgments from Michael
1. Introduction
A Rich Data Model; Easy Scaling; Tons of Features…; …Without Sacrificing Speed; Simple Administration; But Wait, That’s Not All…
2. Getting Started
Documents; Collections; Schema-Free; Naming; Subcollections; Databases; Getting and Starting MongoDB; MongoDB Shell; Running the Shell; A MongoDB Client; Basic Operations with the Shell; Create; Read; Update; Delete; Tips for Using the Shell; Inconvenient collection names; Data Types; Basic Data Types; Numbers; Dates; Arrays; Embedded Documents; _id and ObjectIds; ObjectIds; Autogeneration of _id
3. Creating, Updating, and Deleting Documents
Inserting and Saving Documents; Batch Insert; Inserts: Internals and Implications; Removing Documents; Remove Speed; Updating Documents; Document Replacement; Using Modifiers; Getting started with the “$set” modifier; Incrementing and decrementing; Array modifiers; Positional array modifications; Modifier speed; Upserts; The save Shell Helper; Updating Multiple Documents; Returning Updated Documents; The Fastest Write This Side of Mississippi; Safe Operations; Catching “Normal” Errors; Requests and Connections
4. Querying
Introduction to find; Specifying Which Keys to Return; Limitations; Query Criteria; Query Conditionals; OR Queries; $not; Rules for Conditionals; Type-Specific Queries; null; Regular Expressions; Querying Arrays; $all; $size; The $slice operator; Querying on Embedded Documents; $where Queries; Cursors; Limits, Skips, and Sorts; Comparison order; Avoiding Large Skips; Paginating results without skip; Finding a random document; Advanced Query Options; Getting Consistent Results; Cursor Internals
5. Indexing
Introduction to Indexing; Scaling Indexes; Indexing Keys in Embedded Documents; Indexing for Sorts; Uniquely Identifying Indexes; Unique Indexes; Dropping Duplicates; Compound Unique Indexes; Using explain and hint; Index Administration; Changing Indexes; Geospatial Indexing; Compound Geospatial Indexes; The Earth Is Not a 2D Plane
6. Aggregation
count; distinct; group; Using a Finalizer; Using a Function as a Key; MapReduce; Example 1: Finding All Keys in a Collection; Example 2: Categorizing Web Pages; MongoDB and MapReduce; The finalize function; Keeping output collections; MapReduce on a subset of documents; Using a scope; Getting more output
7. Advanced Topics
Database Commands; How Commands Work; Command Reference; Capped Collections; Properties and Use Cases; Creating Capped Collections; Sorting Au Naturel; Tailable Cursors; GridFS: Storing Files; Getting Started with GridFS: mongofiles; Working with GridFS from the MongoDB Drivers; Under the Hood; Server-Side Scripting; db.eval; Stored JavaScript; Security; Database References; What Is a DBRef?; Example Schema; Driver Support for DBRefs; When Should DBRefs Be Used?
8. Administration
Starting and Stopping MongoDB; Starting from the Command Line; File-Based Configuration; Stopping MongoDB; Monitoring; Using the Admin Interface; serverStatus; mongostat; Third-Party Plug-Ins; Security and Authentication; Authentication Basics; How Authentication Works; Other Security Considerations; Backup and Repair; Data File Backup; mongodump and mongorestore; fsync and Lock; Slave Backups; Repair

9. Replication
Master-Slave Replication; Options; Adding and Removing Sources; Replica Sets; Initializing a Set; Nodes in a Replica Set; Failover and Primary Election; Performing Operations on a Slave; Read Scaling; Using Slaves for Data Processing; How It Works; The Oplog; Syncing; Replication State and the Local Database; Blocking for Replication; Administration; Diagnostics; Changing the Oplog Size; Replication with Authentication
10. Sharding
Introduction to Sharding; Autosharding in MongoDB; When to Shard; The Key to Sharding: Shard Keys; Sharding an Existing Collection; Incrementing Shard Keys Versus Random Shard Keys; How Shard Keys Affect Operations; Setting Up Sharding; Starting the Servers; Adding a shard; Sharding Data; Production Configuration; A Robust Config; Many mongos; A Sturdy Shard; Physical Servers; Sharding Administration; config Collections; Shards; Databases; Chunks; Sharding Commands; Getting a summary; Removing a shard
11. Example Applications
Chemical Search Engine: Java; Installing the Java Driver; Using the Java Driver; Schema Design; Writing This in Java; Issues; News Aggregator: PHP; Installing the PHP Driver; Windows install; Mac OS X Install; Linux and Unix install; Using the PHP Driver; Designing the News Aggregator; Trees of Comments; Voting; Custom Submission Forms: Ruby; Installing the Ruby Driver; Using the Ruby Driver; Custom Form Submission; Ruby Object Mappers and Using MongoDB with Rails; Real-Time Analytics: Python; Installing PyMongo; Using PyMongo; MongoDB for Real-Time Analytics; Schema; Handling a Request; Using Analytics Data; Other Considerations
A. Installing MongoDB
Choosing a Version; Windows Install; Installing as a Service; POSIX (Linux, Mac OS X, and Solaris) Install; Installing from a Package Manager
B. mongo: The Shell
Shell Utilities
C. MongoDB Internals
BSON; Wire Protocol; Data Files; Namespaces and Extents; Memory-Mapped Storage Engine
Index
About the Authors
Colophon
Copyright

Content preview from MongoDB: The Definitive Guide

Chapter 10. Sharding

Sharding is MongoDB’s approach to scaling out. Sharding allows you to add more machines to handle increasing load and data size without affecting your application.

Introduction to Sharding

Sharding refers to the process of splitting data up and storing different portions of the data on different machines; the term partitioning is also sometimes used to describe this concept. By splitting data up across machines, it becomes possible to store more data and handle more load without requiring large or powerful machines.

Manual sharding can be done with almost any database software. In manual sharding, an application maintains connections to several different database servers, each of which is completely independent. The application code manages storing different data on different servers and querying the appropriate server to get data back. This approach can work well but becomes difficult to maintain when adding or removing nodes from the cluster or in the face of changing data distributions or load patterns.
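
As a rough illustration of manual sharding, the sketch below shows an application-side router that picks a server for each key. It is a minimal sketch, not taken from the book: the class name ManualShardRouter, the server names, and the choice of hashing the key modulo the number of servers are all assumptions made for illustration, and plain dictionaries stand in for real database connections.

```python
import hashlib


class ManualShardRouter:
    """Toy application-side router (hypothetical, for illustration only)."""

    def __init__(self, server_names):
        self.names = list(server_names)
        # Each "server" is a plain dict standing in for an independent database.
        self.servers = {name: {} for name in self.names}

    def server_for(self, key):
        # Assumed routing rule: hash the key, then take it modulo the server count.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.names[int(digest, 16) % len(self.names)]

    def put(self, key, document):
        # The application decides where the document is stored.
        self.servers[self.server_for(key)][key] = document

    def get(self, key):
        # The application must query the same server it wrote to.
        return self.servers[self.server_for(key)].get(key)


def moved_fraction(keys, old_names, new_names):
    """Fraction of keys whose server changes when the server list changes."""
    old = ManualShardRouter(old_names)
    new = ManualShardRouter(new_names)
    moved = sum(1 for k in keys if old.server_for(k) != new.server_for(k))
    return moved / len(keys)


router = ManualShardRouter(["db-a", "db-b", "db-c"])
router.put("user:42", {"name": "Ada"})
print(router.get("user:42"))

keys = [f"user:{i}" for i in range(1000)]
fraction = moved_fraction(
    keys, ["db-a", "db-b", "db-c"], ["db-a", "db-b", "db-c", "db-d"]
)
print(f"keys that would have to move after adding a server: {fraction:.0%}")
```

With this particular routing rule, going from three servers to four changes the target server for most keys (about three quarters in expectation), and the application would have to migrate that data itself. This is one concrete form of the maintenance burden described above.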

MongoDB supports autosharding, which eliminates some of the administrative headaches of manual sharding. The cluster handles splitting up data and rebalancing automatically. Throughout the rest of this book (and most MongoDB documentation in general), the terms sharding and autosharding are used interchangeably, but it’s important to distinguish autosharding from manual sharding done in application code.
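
To make "splitting up data and rebalancing automatically" more concrete, here is a toy model in which the caller only ever inserts keys while the cluster object splits and moves data on its own. It is a sketch under stated assumptions and does not reproduce MongoDB's actual implementation: the ToyCluster class, the max_chunk_size threshold, the split-in-half rule, and the balance-by-chunk-count rule are all hypothetical.

```python
from bisect import insort


class ToyCluster:
    """Toy autosharding model (hypothetical; not MongoDB's real algorithm)."""

    def __init__(self, shard_names, max_chunk_size=4):
        self.shard_names = list(shard_names)
        self.max_chunk_size = max_chunk_size
        # Chunks are kept in key order; each holds sorted keys and an owning shard.
        self.chunks = [{"keys": [], "shard": self.shard_names[0]}]

    def _chunk_index_for(self, key):
        # Pick the last chunk whose smallest key is <= key (else the first chunk).
        index = 0
        for i, chunk in enumerate(self.chunks):
            if chunk["keys"] and chunk["keys"][0] <= key:
                index = i
        return index

    def insert(self, key):
        # The only call the application makes; splitting and moving happen inside.
        i = self._chunk_index_for(key)
        insort(self.chunks[i]["keys"], key)
        if len(self.chunks[i]["keys"]) > self.max_chunk_size:
            self._split(i)
            self._rebalance()

    def _split(self, i):
        # Split an oversized chunk into two halves that stay on the same shard.
        chunk = self.chunks[i]
        mid = len(chunk["keys"]) // 2
        right = {"keys": chunk["keys"][mid:], "shard": chunk["shard"]}
        chunk["keys"] = chunk["keys"][:mid]
        self.chunks.insert(i + 1, right)

    def chunk_counts(self):
        counts = {name: 0 for name in self.shard_names}
        for chunk in self.chunks:
            counts[chunk["shard"]] += 1
        return counts

    def _rebalance(self):
        # Move chunks from the busiest shard to the idlest until counts differ by <= 1.
        while True:
            counts = self.chunk_counts()
            busiest = max(counts, key=counts.get)
            idlest = min(counts, key=counts.get)
            if counts[busiest] - counts[idlest] <= 1:
                return
            for chunk in self.chunks:
                if chunk["shard"] == busiest:
                    chunk["shard"] = idlest
                    break


cluster = ToyCluster(["shard0", "shard1"], max_chunk_size=4)
for key in range(20):
    cluster.insert(key)
print(cluster.chunk_counts())
```

The calling code never names a shard. Contrast this with manual sharding, where the application itself chooses the server for every read and write.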

Autosharding in MongoDB

The basic concept behind MongoDB’s ...

Publisher Resources

ISBN: 9781449381578
