Chapter 8. Integrating R and Hadoop for statistics and more

 

This chapter covers
  • Integrating your R scripts with MapReduce and Streaming
  • Understanding Rhipe, RHadoop, and R + Streaming

 

R is a statistical programming language for analyzing data and graphing the results. Its capabilities[1] support statistical and predictive analytics, data mining, and data visualization. Its breadth of coverage and its applicability across sectors such as finance, life sciences, manufacturing, and retail make it a popular tool.

1 R contains built-in as well as user-created packages, which can be accessed via CRAN, its package distribution system; see http://cran.r-project.org/web/packages/.
