Planning for Big Data

Name: Planning for Big Data
Author: Edd Wilder-James
ISBN: 9781449329648

by Edd Wilder-James

March 2012

Beginner to intermediate

83 pages

1h 52m

English

O'Reilly Media, Inc.

Planning for Big Data
Introduction
1. The Feedback Economy
    Data-Obese, Digital-Fast
    The Big Data Supply Chain
    Data collection
    Ingesting and cleaning
    Hardware
    Platforms
    Machine learning
    Human exploration
    Storage
    Sharing and acting
    Measuring and collecting feedback
    Replacing Everything with Data
    A Feedback Economy
2. What Is Big Data?
    What Does Big Data Look Like?
    Volume
    Velocity
    Variety
    In Practice
    Cloud or in-house?
    Big data is big
    Big data is messy
    Culture
    Know where you want to go
3. Apache Hadoop
    The Core of Hadoop: MapReduce
    Hadoop’s Lower Levels: HDFS and MapReduce
    Improving Programmability: Pig and Hive
    Improving Data Access: HBase, Sqoop, and Flume
    Getting data in and out
    Coordination and Workflow: Zookeeper and Oozie
    Management and Deployment: Ambari and Whirr
    Machine Learning: Mahout
    Using Hadoop
4. Big Data Market Survey
    Just Hadoop?
    Integrated Hadoop Systems
    EMC Greenplum
    IBM
    Microsoft
    Oracle
    Availability
    Analytical Databases with Hadoop Connectivity
    Quick facts
    Hadoop-Centered Companies
    Cloudera
    Hortonworks
    An overview of Hadoop distributions (part 1)
    An overview of Hadoop distributions (part 2)
    Notes
5. Microsoft’s Plan for Big Data
    Microsoft’s Hadoop Distribution
    Developers, Developers, Developers
    Streaming Data and NoSQL
    Toward an Integrated Environment
    The Data Marketplace
    Summary
6. Big Data in the Cloud
    IaaS and Private Clouds
    Platform solutions
    Amazon Web Services
    Elastic Map Reduce
    DynamoDB
    Google
    BigQuery
    Prediction API
    Microsoft
    Big data cloud platforms compared
    Conclusion
    Notes
7. Data Marketplaces
    What Do Marketplaces Do?
    Infochimps
    Factual
    Windows Azure Data Marketplace
    DataMarket
    Data Markets Compared
    Other Data Suppliers
8. The NoSQL Movement
    Size, Response, Availability
    Changing Data and Cheap Lunches
    The Sacred Cows
    Other features
    In the End
9. Why Visualization Matters
    A Picture Is Worth 1000 Rows
    Types of Visualization
    Explaining and exploring
    Your Customers Make Decisions, Too
    Do Yourself a Favor and Hire a Designer
10. The Future of Big Data
    More Powerful and Expressive Tools for Analysis
    Streaming Data Processing
    Rise of Data Marketplaces
    Development of Data Science Workflows and Tools
    Increased Understanding of and Demand for Visualization
About the Author
Copyright

Content preview from Planning for Big Data

Chapter 1. The Feedback Economy

By Alistair Croll

Military strategist John Boyd spent a lot of time understanding how to win battles. Building on his experience as a fighter pilot, he broke down the process of observing and reacting into something called an Observe, Orient, Decide, and Act (OODA) loop. Combat, he realized, consisted of observing your circumstances, orienting yourself to your enemy’s way of thinking and your environment, deciding on a course of action, and then acting on it.

Figure: The Observe, Orient, Decide, and Act (OODA) loop.

The most important part of this loop isn’t included in the OODA acronym, however. It’s the fact that it’s a loop. The results of earlier actions feed back into later, hopefully wiser, ones. Over time, the fighter “gets inside” their opponent’s loop, outsmarting and outmaneuvering them. The system learns.

Boyd’s genius was to realize that winning requires two things: being able to collect and analyze information better, and being able to act on that information faster, incorporating what’s learned into the next iteration. Today, what Boyd learned in a cockpit applies to nearly everything we do.

Data-Obese, Digital-Fast

In our always-on lives we’re flooded with cheap, abundant information. We need to capture and analyze it well, separating digital wheat from digital chaff, identifying meaningful undercurrents while ignoring ...

Publisher Resources

ISBN: 9781449333348
