Chapter 15

Using Open Source R for Data Science

In This Chapter

Grasping the basic R vocabulary and concepts

Previewing popular R packages

Playing with more advanced R packages

R is a free, open source statistical software system that, like Python, has been widely adopted across the data science sector over the last decade. In fact, data scientists carry on something of a never-ending squabble about which programming language is actually best suited for data science. Practitioners who favor R generally do so because of its advanced statistical programming and data visualization capabilities, which they consider difficult to replicate in Python. Among data science practitioners specifically, R’s user base is broader than Python’s. (For more on Python, see Chapter 14; note that the R programming language and the packages that support it are downloadable from http://cran.r-project.org.)

Introducing the Fundamental Concepts

R is not as easy to learn as Python, but it can be more powerful for certain types of advanced statistical analyses. Although R’s learning curve is somewhat steeper than Python’s, the programming language is nonetheless ...
