Chapter 8. Integrating R and Hadoop for statistics and more

 

This chapter covers
  • Integrating your R scripts with MapReduce and Streaming
  • Understanding Rhipe, RHadoop, and R + Streaming

 

R is a statistical programming language for performing data analysis and graphing the results. The capabilities of R[1] let you perform statistical and predictive analytics, data mining, and visualization functions on your data. Its breadth of coverage and applicability across a wide range of sectors (such as finance, life sciences, manufacturing, retail, and more) make it a popular tool.

1 R contains built-in as well as user-created packages, which can be accessed via CRAN, its package distribution system; see http://cran.r-project.org/web/packages/
