R Packages by Hadley Wickham

Preface

In This Book

This book will guide you from being a user of R packages to being a creator of R packages. In Chapter 1, Introduction, you’ll learn why mastering this skill is so important, and why it’s easier than you think. Next, you’ll learn about the basic structure of a package, and the forms it can take, in Chapter 2, Package Structure. The subsequent chapters go into more detail about each component. They’re roughly organized in order of importance:

Chapter 3, R code
The most important directory is R/, where your R code lives. A package with just this directory is still a useful package. (And indeed, if you stop reading the book after this chapter, you’ll have still learned some useful new skills.)
Chapter 4, Package Metadata
The DESCRIPTION lets you describe what your package needs to work. If you’re sharing your package, you’ll also use the DESCRIPTION to describe what it does, who can use it (the license), and who to contact if things go wrong.
Chapter 5, Object Documentation
If you want other people (including “future you”!) to understand how to use the functions in your package, you’ll need to document them. I’ll show you how to use roxygen2 to document your functions. I recommend roxygen2 because it lets you write code and documentation together while continuing to produce R’s standard documentation format.
Chapter 6, Vignettes: Long-Form Documentation
Function documentation describes the nitpicky details of every function in your package. Vignettes give the big picture. They’re long-form documents that show how to combine multiple parts of your package to solve real problems. I’ll show you how to use R Markdown and knitr to create vignettes with a minimum of fuss.
Chapter 7, Testing
To ensure your package works as designed (and continues to work as you make changes), it’s essential to write unit tests that define correct behavior, and alert you when functions break. In this chapter, I’ll teach you how to use the testthat package to convert the informal interactive tests that you’re already doing to formal, automated tests.
Chapter 8, Namespace
To play nicely with others, your package needs to define which functions it makes available to other packages and which functions it requires from other packages. This is the job of the NAMESPACE file, and I’ll show you how to use roxygen2 to generate it for you. NAMESPACE is one of the more challenging parts of developing an R package, but it’s critical to master if you want your package to work reliably.
Chapter 9, External Data
The data/ directory allows you to include data with your package. You might do this to bundle data in a way that’s easy for R users to access, or just to provide compelling examples in your documentation.
Chapter 10, Compiled Code
R code is designed for human efficiency, not computer efficiency, so it’s useful to have a tool in your back pocket that allows you to write fast code. The src/ directory allows you to include speedy compiled C and C++ code to solve performance bottlenecks in your package.
Chapter 11, Installed Files
You can include arbitrary extra files in the inst/ directory. This is most commonly used for extra information about how to cite your package, and to provide more details about copyrights and licenses.
Chapter 12, Other Components
This chapter documents the handful of other components that are rarely needed: demo/, exec/, po/, and tools/.

The final three chapters describe general best practices not specifically tied to one directory:

Chapter 13, Git and GitHub
Mastering a version control system is vital for collaborating with others, and is useful even for solo work because it allows you to easily undo mistakes. In this chapter, you’ll learn how to use the popular Git and GitHub combo with RStudio.
Chapter 14, Automated Checking
R provides useful automated quality checks in the form of R CMD check. Running them regularly is a great way to avoid many common mistakes. The results can sometimes be a bit cryptic, so I provide a comprehensive cheat sheet to help you convert warnings to actionable insight.
Chapter 15, Releasing a Package
The life cycle of a package culminates with release to the public. This chapter compares the two main options (CRAN and GitHub) and offers general advice on managing the process.

This is a lot to learn, but don’t feel overwhelmed. Start with a minimal subset of useful features (e.g., just an R/ directory!) and build up over time. To paraphrase the Zen monk Shunryū Suzuki: “Each package is perfect the way it is—and it can use a little improvement.”

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at http://r-pkgs.had.co.nz/.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “R Packages by Hadley Wickham (O’Reilly). Copyright 2015 Hadley Wickham, 978-1-491-91059-7.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at .

Safari® Books Online

Safari Books Online is an on-demand digital library that delivers expert content in both book and video form from the world’s leading authors in technology and business.

Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.

Safari Books Online offers a range of plans and pricing for enterprise, government, education, and individuals.

Members have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and hundreds more. For more information about Safari Books Online, please visit us online.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/r-packages.

To comment or ask technical questions about this book, send email to .

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly

Follow us on Twitter: http://twitter.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

The tools in this book wouldn’t be possible without many open source contributors. Winston Chang, my coauthor on devtools, spent hours debugging painful S4 and compiled code problems so that devtools can quickly reload code for the vast majority of packages. Kirill Müller contributed great patches to many of my package development packages including devtools, testthat, and roxygen2. Kevin Ushey, JJ Allaire, and Dirk Eddelbuettel tirelessly answered all my basic C, C++, and Rcpp questions. Peter Danenberg and Manuel Eugster wrote the first version of roxygen2 during a Google Summer of Code. Craig Citro wrote much of the code to allow Travis to work with R packages.

Often the only way I learn how to do it the right way is by doing it the wrong way first. For suffering many package development errors, I’d like to thank all the CRAN maintainers, especially Brian Ripley, Uwe Ligges, and Kurt Hornik.

This book was written in the open and it is truly a community effort: many people read drafts, fixed typos, suggested improvements, and contributed content. Without those contributors, the book wouldn’t be nearly as good as it is, and I’m deeply grateful for their help. A special thanks goes to Peter Li, who read the book from cover to cover and provided many fixes. I also deeply appreciate the time the reviewers (Duncan Murdoch, Karthik Ram, Vitalie Spinu, and Ramnath Vaidyanathan) spent reading the book and giving me thorough feedback.

Thanks go to all contributors who submitted improvements via GitHub (in alphabetical order): @aaronwolen, @adessy, Adrien Todeschini, Andrea Cantieni, Andy Visser, @apomatix, Ben Bond-Lamberty, Ben Marwick, Brett K, Brett Klamer, @contravariant, Craig Citro, David Robinson, David Smith, @davidkane9, Dean Attali, Eduardo Ariño de la Rubia, Federico Marini, Gerhard Nachtmann, Gerrit-Jan Schutten, Hadley Wickham, Henrik Bengtsson, @heogden, Ian Gow, @jacobbien, Jennifer (Jenny) Bryan, Jim Hester, @jmarshallnz, Jo-Anne Tan, Joanna Zhao, Joe Cainey, John Blischak, @jowalski, Justin Alford, Karl Broman, Karthik Ram, Kevin Ushey, Kun Ren, @kwenzig, @kylelundstedt, @lancelote, Lech Madeyski, @lindbrook, @maiermarco, Manuel Reif, Michael Buckley, @MikeLeonard, Nick Carchedi, Oliver Keyes, Patrick Kimes, Paul Blischak, Peter Meissner, @PeterDee, Po Su, R. Mark Sharp, Richard M. Smith, @rmar073, @rmsharp, Robert Krzyzanowski, @ryanatanner, Sascha Holzhauer, @scharne, Sean Wilkinson, @SimonPBiggs, Stefan Widgren, Stephen Frank, Stephen Rushe, Tony Breyal, Tony Fischetti, @urmils, Vlad Petyuk, Winston Chang, @winterschlaefer, @wrathematics, and @zhaoy.
