IntroductionDistributed data processingBig data processing requirementsTechnologies for big data processingMapReduceMapReduce programming modelMapReduce Google architectureHistoryHadoop core componentsNameNodeDataNodeImageJournalCheckpointHDFS startupBlock allocation and storageHDFS clientReplication and recoveryNameNode and DataNode—communication and managementHeartbeatsCheckPointNode and BackupNodeCheckPointNodeBackupNodeFilesystem snapshotsYARN scalabilityYARN execution flowZookeeper featuresLocks and processingFailure and recoveryProgramming with Pig LatinPig data typesRunning Pig programsPig program flowCommon Pig commandHBASE architectureHBASE architecture implementationHive architectureExecution—how does Hive process queries?Hive data typesHive examplesHCatalogCAP theoremA keyspace has configurable properties that are critical to understandCassandra ring architectureThe design features of document-oriented databases include the following: