Deploy multi-node Hadoop clusters to harness the Cloud for storage and large-scale data processing
About This Video
Familiarize yourself with Hadoop and its services, and how to configure them
Deploy compute instances and set up a three-node Hadoop cluster on Amazon
Set up a Linux installation optimized for Hadoop
Hadoop is an Apache top-level project that allows the distributed processing of large data sets across clusters of computers using simple programming models. It allows you to deliver a highly available service on top of a cluster of computers, each of which may be prone to failures. While Big Data and Hadoop have seen a massive surge in popularity over the last few years, many companies still struggle to set up their own computing clusters.
This video series will turn you from a faltering first-timer into a Hadoop pro through clear, concise descriptions that are easy to follow.
We'll begin this course with an overview of Amazon's cloud service and its use. We'll then deploy Linux compute instances and you'll see how to connect your client machine to Linux hosts and configure your systems to run Hadoop. Finally, you'll install Hadoop, download data, and examine how to run a query.
This video series will go beyond just Hadoop; it will cover everything you need to get your own clusters up and running. You will learn how to make network configuration changes as well as modify Linux services. After you've installed Hadoop, we'll then go over installing HUE, Hadoop's UI. Using HUE, you will learn how to download data to your Hadoop clusters, move it to HDFS, and finally query that data with Hive.
Learn everything you need to deploy Hadoop clusters to the Cloud through these videos. You'll grasp all you need to know about handling large data sets over multiple nodes.
Table of contents
- Chapter 1 : Deploying Cloud Instances for Hadoop 2.0
- Chapter 2 : Setting Up Network and Security Settings
- Chapter 3 : Connecting to Cloud Instances
- Chapter 4 : Setting Up Network Connectivity and Access for Hadoop Clusters
- Chapter 5 : Setting Up Configuration Settings across Hadoop Clusters
- Chapter 6 : Creating a Hadoop Cluster
- Chapter 7 : Loading and Navigating the Hadoop File System (HDFS)
- Chapter 8 : Hadoop Tools and Processing Files
- Title: Building Hadoop Clusters
- Release date: May 2014
- Publisher(s): Packt Publishing
- ISBN: 9781783284030