R Packages by Hadley Wickham

Chapter 1. Introduction

In R, the fundamental unit of shareable code is the package. A package bundles together code, data, documentation, and tests, and is easy to share with others. As of January 2015, there were over 6,000 packages available on the Comprehensive R Archive Network, or CRAN, the public clearing house for R packages. This huge variety of packages is one of the reasons that R is so successful: chances are that someone has already solved a problem that you’re working on, and you can benefit from their work by downloading their package.

If you’re reading this book, you already know how to use packages:

  • You install them from CRAN with install.packages("x").
  • You use them in R with library(x).
  • You get help on them with package?x and help(package = "x").

The goal of this book is to teach you how to develop packages so that you can write your own, not just use other people’s. Why write a package? One compelling reason is that you have code that you want to share with others. Bundling your code into a package makes it easy for other people to use it, because like you, they already know how to use packages. If your code is in a package, any R user can easily download it, install it, and learn how to use it.

But packages are useful even if you never share your code. As Hilary Parker says in her introduction to packages: “Seriously, it doesn’t have to be about sharing your code (although that is an added benefit!). It is about saving yourself time.” Organizing code in a package makes your life easier because packages come with conventions. For example, you put R code in R/, you put tests in tests/, and you put data in data/. These conventions are helpful because:

  • They save you time: instead of having to think about the best way to organize a project, you can just follow a template.
  • Standardized conventions lead to standardized tools: if you buy into R’s package conventions, you get many tools for free.
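
As a rough illustration of the directory conventions mentioned above, the following sketch lays out R/, tests/, and data/ in a temporary directory using only base R. The package name mypkg and the function hello() are made up for the example. A real package needs more than this (a DESCRIPTION file, among other things), so treat it as a picture of the layout rather than an installable package.

```r
# Build a throwaway skeleton in a temporary directory.
# "mypkg" is a hypothetical package name used only for illustration.
pkg <- file.path(tempdir(), "mypkg")
dirs <- file.path(pkg, c("R", "tests", "data"))
for (d in dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# R code lives in R/; hello() is a made-up example function.
writeLines(
  'hello <- function() "Hello, world!"',
  file.path(pkg, "R", "hello.R")
)

# Show the resulting layout (directories and files).
list.files(pkg, recursive = TRUE, include.dirs = TRUE)
```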

It’s even possible to use packages to structure your data analyses, as Robert M. Flight discusses in a series of blog posts.

Philosophy

This book espouses my philosophy of package development: anything that can be automated should be automated. Do as little as possible by hand. Do as much as possible with functions. The goal is to spend your time thinking about what you want your package to do rather than thinking about the minutiae of package structure.

This philosophy is realized primarily through the devtools package, a suite of R functions that I wrote to automate common development tasks. The goal of devtools is to make package development as painless as possible. It does this by encapsulating all of the best practices of package development that I’ve learned over the years. Devtools protects you from many potential mistakes, so you can focus on the problem you’re interested in, not on developing a package.

Devtools works hand in hand with RStudio, which I believe is the best development environment for most R users. The only real competitor is Emacs Speaks Statistics (ESS), which is a rewarding environment if you’re willing to put in the time to learn Emacs and customize it to your needs. The history of ESS stretches back over 20 years (predating R!), but it’s still actively developed and many of the workflows described in this book are also available there.

Together, devtools and RStudio insulate you from the low-level details of how packages are built. As you start to develop more packages, I highly recommend that you learn more about those details. The definitive reference for package development is the official Writing R Extensions manual. However, this manual can be hard to understand if you’re not already familiar with the basics of packages. It’s also exhaustive, covering every possible package component, rather than focusing on the most common and useful components, as this book does. Writing R Extensions is a useful resource once you’ve mastered the basics and want to learn what’s going on under the hood.

Getting Started

To get started, make sure you have the latest version of R (at least 3.1.2, which is the version that the code in this book uses), then run the following code to get the packages you’ll need:

install.packages(c("devtools", "roxygen2", "testthat", "knitr"))
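
If you would rather not reinstall packages you already have, one possible approach is to check the R version and then install only what is missing. The helper missing_packages() below is a hypothetical name, not part of devtools or base R, and the install call is left commented out so the sketch does not touch the network.

```r
# missing_packages() is a hypothetical helper, not part of any package.
# It returns the subset of `pkgs` that cannot be loaded.
# Note: requireNamespace() loads a package's namespace as a side effect.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The text asks for at least R 3.1.2.
if (getRversion() < "3.1.2") {
  message("R ", getRversion(), " is older than 3.1.2; consider upgrading.")
}

needed <- c("devtools", "roxygen2", "testthat", "knitr")
to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}
```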

Make sure you have a recent version of RStudio. You can check that you have the right version by running the following:

install.packages("rstudioapi")
rstudioapi::isAvailable("0.99.149")

If not, you may need to install the preview version. This gives you access to the latest and greatest features, and only slightly increases your chances of finding a bug.
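
The check above assumes that you are running inside RStudio. A slightly more defensive sketch, which returns FALSE when rstudioapi is not installed or when the code runs outside RStudio, might look like the following. The function name rstudio_at_least() is hypothetical.

```r
# rstudio_at_least() is a hypothetical helper.
# It returns TRUE only when running inside RStudio at `version` or newer,
# and FALSE when rstudioapi is missing or R is running elsewhere.
rstudio_at_least <- function(version = "0.99.149") {
  requireNamespace("rstudioapi", quietly = TRUE) &&
    rstudioapi::isAvailable(version)
}

rstudio_at_least()
```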

If you want to keep up with the bleeding edge of devtools development, you can use the following code to access new functions as I develop them:

devtools::install_github("hadley/devtools")

You’ll need a C compiler and a few command-line tools. If you’re on Windows or Mac and you don’t already have them, RStudio will install them for you. Otherwise:

  • On Windows, download and install Rtools. Note: this is not an R package!

  • On Mac, make sure you have either Xcode (available for free in the App Store) or the “Command-Line Tools for Xcode”. You’ll need to have a (free) Apple ID.

  • On Linux, make sure you’ve installed not only R, but also the R development tools. For example, on Ubuntu (and Debian) you need to install the r-base-dev package.

You can check that you have everything installed by running the following code:

library(devtools)
has_devel()
#> '/Library/Frameworks/R.framework/Resources/bin/R' --vanilla CMD SHLIB foo.c 
#> 
#> clang -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG 
#>   -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include
#>   -fPIC  -Wall -mtune=core2 -g -O2  -c foo.c -o foo.o
#> clang -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup
#>   -single_module -multiply_defined suppress -L/usr/local/lib -o foo.so foo.o 
#>   -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework 
#>   -Wl,CoreFoundation
#> [1] TRUE

This prints the compiler commands that it runs, which I use to help diagnose problems. If everything is OK, it will return TRUE. Otherwise, it will throw an error and you’ll need to investigate the problem.
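
In a script, you may want a failed check to produce a message rather than stop execution. The sketch below wraps the check in tryCatch(). The name toolchain_ok() is hypothetical, and the check function is passed as an argument so that the wrapper can be tried without a compiler. It treats both an error and any result other than TRUE as a failure, since the exact behavior may differ between devtools versions.

```r
# toolchain_ok() is a hypothetical wrapper around a build-tools check.
# `check` defaults to devtools::has_devel; the default is only evaluated
# when no other function is supplied.
toolchain_ok <- function(check = devtools::has_devel) {
  tryCatch(
    isTRUE(check()),
    error = function(e) {
      message("Build tools check failed: ", conditionMessage(e))
      FALSE
    }
  )
}

# With devtools installed, usage could look like:
# if (!toolchain_ok()) message("Install a C compiler before continuing.")

# Stand-in checks show how the wrapper behaves:
toolchain_ok(function() TRUE)
toolchain_ok(function() stop("no compiler found"))
```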

Conventions

Throughout this book I write foo() to refer to functions, bar to refer to variables and function parameters, and baz/ to refer to paths. Larger code blocks intermingle input and output. Output is commented so that if you have an electronic version of the book (e.g., http://r-pkgs.had.co.nz), you can easily copy and paste examples into R. Output comments look like #> to distinguish them from regular comments.
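
As a small illustration of this convention, the block below mixes a regular comment, input, and commented output. The variable x is just an example value.

```r
# A regular comment: compute the mean of three numbers
x <- c(1, 2, 3)
mean(x)
#> [1] 2
```

Because the output line starts with #>, R treats it as a comment, so the whole block can be pasted into the console unchanged.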

Colophon

This book was written in R Markdown inside RStudio. knitr and pandoc converted the raw R Markdown to HTML and PDF. The website was made with Jekyll, styled with Bootstrap, and published to Amazon’s S3 by Travis CI. The complete source is available from GitHub. This version of the book was built with:

library(roxygen2)
library(testthat)
devtools::session_info()
#> Session info --------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.1.2 (2014-10-31)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       <NA>
#> Packages ------------------------------------------------------------------
#>  package    * version    date       source                            
#>  bookdown     0.1        2015-02-12 Github (hadley/bookdown@fde0b07)  
#>  devtools   * 1.7.0.9000 2015-02-12 Github (hadley/devtools@9415a8a)  
#>  digest     * 0.6.8      2014-12-31 CRAN (R 3.1.2)                    
#>  evaluate   * 0.5.5      2014-04-29 CRAN (R 3.1.0)                    
#>  formatR    * 1.0        2014-08-25 CRAN (R 3.1.1)                    
#>  htmltools  * 0.2.6      2014-09-08 CRAN (R 3.1.2)                    
#>  knitr      * 1.9        2015-01-20 CRAN (R 3.1.2)                    
#>  Rcpp       * 0.11.4     2015-01-24 CRAN (R 3.1.2)                    
#>  rmarkdown    0.5.1      2015-02-12 Github (rstudio/rmarkdown@0f19584)
#>  roxygen2     4.1.0      2014-12-13 CRAN (R 3.1.2)                    
#>  rstudioapi * 0.2        2014-12-31 CRAN (R 3.1.2)                    
#>  stringr    * 0.6.2      2012-12-06 CRAN (R 3.0.0)                    
#>  testthat     0.9.1      2014-10-01 CRAN (R 3.1.1)
