Chapter 4. Hadoop and MapReduce Framework for R
In this chapter, we enter the diverse world of Big Data tools and applications that can be integrated with the R language relatively easily. We will present you with a set of guidelines and tips on the following topics:
- Deploying cloud-based virtual machines with Hadoop, the ready-to-use Hadoop Distributed File System (HDFS), and MapReduce frameworks
- Configuring your instance/virtual machine to include essential libraries and useful supplementary tools for data management in HDFS
- Managing HDFS using shell/Terminal commands and running a simple MapReduce word count in Java for comparison
- Integrating the R statistical environment with Hadoop on a single-node cluster
- Managing files in ...