Book description
Hadoop has revolutionized data processing and enterprise data warehousing, but its explosive growth has come with a large amount of uncertainty, hype, and confusion. With this report, enterprise decision makers will receive a concise crash course on what Hadoop is and why it’s important.
Hadoop represents a major shift from traditional enterprise data warehousing and data analytics, and its technology can be daunting at first. Donald Miner, founder of the data science firm Miner & Kasch, covers just enough ground so you can make intelligent decisions about Hadoop in your enterprise.
By the end of this report, you’ll know the basics of technologies such as HDFS, MapReduce, and YARN, without becoming mired in the details. Not only will you learn the basics of how Hadoop works and why it’s such an important technology, you’ll also get examples of how you should probably be using it.
Table of contents
- Hadoop: What You Need to Know
  - An Introduction to Hadoop and the Hadoop Ecosystem
  - Hadoop Masks Being a Distributed System
  - Hadoop Scales Out Linearly
  - Hadoop Runs on Commodity Hardware
  - Hadoop Handles Unstructured Data
  - In Hadoop You Load Data First and Ask Questions Later
  - Hadoop is Open Source
  - The Hadoop Distributed File System Stores Data in a Distributed, Scalable, Fault-Tolerant Manner
  - YARN Allocates Cluster Resources for Hadoop
  - MapReduce is a Framework for Analyzing Data
  - Summary
  - Further Reading
Product information
- Title: Hadoop: What You Need to Know
- Author(s): Donald Miner
- Release date: March 2016
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491937303