Chapter 1. Getting Started and Getting Help

This chapter sets the groundwork for the other chapters. It explains how to download, install, and run R.

More importantly, it also explains how to get answers to your questions. The R community provides a wealth of documentation and assistance. You are not alone. Here are some common sources of help:

Local, installed documentation

When you install R on your computer, a mass of documentation is also installed. You can browse the local documentation (Recipe 1.7) and search it (Recipe 1.9). We are amazed how often we search the web for an answer only to discover it was already available in the installed documentation.

Task views

A task view describes packages that are specific to one area of statistical work, such as econometrics, medical imaging, psychometrics, or spatial statistics. Each task view is written and maintained by an expert in the field. There are more than 35 such task views, so there is likely to be one or more for your areas of interest. We recommend that every beginner find and read at least one task view in order to gain a sense of R’s possibilities (Recipe 1.12).

Package documentation

Most packages include useful documentation. Many also include overviews and tutorials, called vignettes in the R community. The documentation is kept with the packages in package repositories such as CRAN, and it is automatically installed on your machine when you install a package.

Question and answer (Q&A) websites

On a Q&A site, anyone can post a question, and knowledgeable people can respond. Readers vote on the answers, so the best answers tend to emerge over time. All this information is tagged and archived for searching. These sites are a cross between a mailing list and a social network; Stack Overflow is the canonical example.

The web

The web is loaded with information about R, and there are R-specific tools for searching it (Recipe 1.11). The web is a moving target, so be on the lookout for new, improved ways to organize and search information regarding R.

Mailing lists

Volunteers have generously donated many hours of time to answer beginners’ questions that are posted to the R mailing lists. The lists are archived, so you can search the archives for answers to your questions (Recipe 1.13).

1.1 Downloading and Installing R

Problem

You want to install R on your computer.

Solution

Windows and macOS users can download R from CRAN, the Comprehensive R Archive Network. Linux and Unix users can install R using their package management tool.

Windows

  1. Open http://www.r-project.org/ in your browser.

  2. Click on “CRAN.” You’ll see a list of mirror sites, organized by country.

  3. Select a site near you or the top one listed as “0-Cloud,” which tends to work well for most locations (https://cloud.r-project.org/).

  4. Click on “Download R for Windows” under “Download and Install R.”

  5. Click on “base.”

  6. Click on the link for downloading the latest version of R (an .exe file).

  7. When the download completes, double-click on the .exe file and answer the usual questions.

macOS

  1. Open http://www.r-project.org/ in your browser.

  2. Click on “CRAN.” You’ll see a list of mirror sites, organized by country.

  3. Select a site near you or the top one listed as “0-Cloud,” which tends to work well for most locations.

  4. Click on “Download R for (Mac) OS X.”

  5. Click on the .pkg file for the latest version of R, under “Latest release:,” to download it.

  6. When the download completes, double-click on the .pkg file and answer the usual questions.

Linux or Unix

The major Linux distributions have packages for installing R. Table 1-1 shows some examples.

Table 1-1. Linux distributions

Distribution         Package name
Ubuntu or Debian     r-base
Red Hat or Fedora    R.i386
SUSE                 R-base

Use the system’s package manager to download and install the package. Normally, you will need the root password or sudo privileges; otherwise, ask a system administrator to perform the installation.

Discussion

Installing R on Windows or macOS is straightforward because there are prebuilt binaries (compiled programs) for those platforms. You need only follow the preceding instructions. The CRAN web pages also contain links to installation-related resources, such as frequently asked questions (FAQs) and tips for special situations (“Does R run under Windows Vista/7/8/Server 2008?”), that you may find useful.

The best way to install R on Linux or Unix is by using your Linux distribution package manager to install R as a package. The distribution packages greatly streamline both the initial installation and subsequent updates.

On Ubuntu or Debian, use apt-get to download and install R. Run under sudo to have the necessary privileges:

$ sudo apt-get install r-base

On Red Hat or Fedora, use yum:

$ sudo yum install R.i386

Most Linux platforms also have graphical package managers, which you might find more convenient.

Beyond the base packages, we recommend installing the documentation packages, too. We like to install r-base-html (because we like browsing the hyperlinked documentation) as well as r-doc-html, which installs the important R manuals locally:

$ sudo apt-get install r-base-html r-doc-html

Some Linux repositories also include prebuilt copies of R packages available on CRAN. We don’t use them because we’d rather get software directly from CRAN itself, which usually has the freshest versions.

In rare cases, you may need to build R from scratch. You might have an obscure, unsupported version of Unix, or you might have special considerations regarding performance or configuration. The build procedure on Linux or Unix is quite standard. Download the tarball from the home page of your CRAN mirror; it’ll be called something like R-3.5.1.tar.gz, except the 3.5.1 will be replaced by the latest version. Unpack the tarball, look for a file called INSTALL, and follow the directions.
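Once the installation finishes, a quick sanity check is to ask R which version it is running. The following is a minimal sketch using only base R; the 3.5.0 threshold is an illustrative assumption, not a requirement stated by CRAN:

```r
# Print the version string of the running R session
R.version.string

# getRversion() returns a version object that can be compared directly
getRversion()

# Compare against a minimum version (3.5.0 is only an illustrative threshold)
is_recent <- getRversion() >= "3.5.0"
is_recent
```

If the version shown is older than you expected, the package manager may have installed a copy from the distribution's own repository rather than the latest release from CRAN.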

See Also

R in a Nutshell by Joseph Adler (O’Reilly) contains more details on downloading and installing R, including instructions for building the Windows and macOS versions. Perhaps the ultimate guide is the one entitled “R Installation and Administration,” available on CRAN, which describes building and installing R on a variety of platforms.

This recipe is about installing the base package. See Recipe 3.10 for installing add-on packages from CRAN.

1.2 Installing RStudio

Problem

You want a more comprehensive integrated development environment (IDE) than the R default. In other words, you want to install RStudio Desktop.

Solution

Over the past few years RStudio has become the most widely used IDE for R. We are of the opinion that almost all R work should be done in the RStudio Desktop IDE, unless there is a compelling reason to do otherwise. RStudio makes multiple products, including RStudio Desktop, RStudio Server, and RStudio Shiny Server, just to name a few. For this book we will use the term RStudio to mean RStudio Desktop, though most concepts apply to RStudio Server as well.

To install RStudio, download the latest installer for your platform from the RStudio website.

The RStudio Desktop Open Source License version is free to download and use.

Discussion

This book was written and built using RStudio version 1.2.x and R versions 3.5.x. New versions of RStudio are released every few months, so be sure to update regularly. Note that RStudio works with whichever version of R you have installed, so updating to the latest version of RStudio does not upgrade your version of R. R must be upgraded separately.

Interacting with R is slightly different in RStudio than in the built-in R user interface. For this book, we’ve elected to use RStudio for all examples.

1.3 Starting RStudio

Problem

You want to run RStudio on your computer.

Solution

A common mistake made by new users of R and RStudio is to accidentally start R when they intended to start RStudio. The easiest way to ensure you’re actually starting RStudio is to search for RStudio on your desktop, then use whatever method your OS provides for pinning the icon somewhere easy to find later:

Windows

Click on the Start Screen menu in the lower-left corner of the screen. In the search box, type RStudio.

macOS

Look in your launchpad for the RStudio app or press Cmd-space (Cmd is the command or ⌘ key) and type RStudio to search using Spotlight Search.

Ubuntu

Press Alt-F1 and type RStudio to search for RStudio.

Discussion

It’s easy to get confused between R and RStudio because, as you can see in Figure 1-1, the icons look similar.

Figure 1-1. R and RStudio icons in macOS

If you click on the R icon, you’ll be greeted by something like Figure 1-2, which is the Base R interface on a Mac, but certainly not RStudio.

Figure 1-2. The R console in macOS

When you start RStudio, by default it will reopen the last project you were working on in RStudio.
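If you are ever unsure which application you are typing into, you can ask the session itself. This sketch assumes that RStudio sets an environment variable named RSTUDIO to "1" in the R sessions it launches; treat that as an assumption to verify on your own installation rather than a guarantee:

```r
# Assumption: RStudio sets RSTUDIO to "1" in sessions it starts;
# in a plain R console the variable is normally empty.
in_rstudio <- identical(Sys.getenv("RSTUDIO"), "1")

if (in_rstudio) {
  message("This session is running inside RStudio.")
} else {
  message("This session is not running inside RStudio.")
}
```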

1.4 Entering Commands

Problem

You’ve started RStudio. Now what?

Solution

When you start RStudio, the main window on the left is an R session. From there you can enter commands interactively, and R evaluates them as you go.

Discussion

R prompts you with >. To get started, just treat R like a big calculator: enter an expression, and R will evaluate the expression and print the result:

> 1 + 1
[1] 2

The computer adds 1 and 1, and displays the result, 2.

The [1] before the 2 might be confusing. To R, the result is a vector, even though it has only one element. R labels the value with [1] to signify that this is the first element of the vector… which is not surprising, since it’s the only element of the vector.
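You can confirm for yourself that a single number really is a one-element vector. The short sketch below uses only base R; the variable names are ours, chosen for illustration:

```r
# A single number is a vector of length 1
result <- 1 + 1
is.vector(result)
length(result)

# A longer vector wraps across several lines when printed, and each
# printed line starts with the index of its first element
squares <- (1:30)^2
squares

# Index into the vector using the same numbering
squares[30]
```

With the longer vector, the bracketed labels become useful: they tell you where in the vector each printed line begins.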

R will prompt you for input until you type a complete expression. The expression max(1,3,5) is a complete expression, so R stops reading input and evaluates what it’s got:

> max(1,3,5)
[1] 5

In contrast, max(1,3, is an incomplete expression, so R prompts you for more input. The prompt changes from greater-than (>) to plus (+), letting you know that R expects more:

> max(1,3,
+ 5)
[1] 5

It’s easy to mistype commands, and retyping them is tedious and frustrating. So R includes command-line editing to make life easier. It defines single keystrokes that let you easily recall, correct, and reexecute your commands. A typical command-line interaction goes like this:

  1. You enter an R expression with a typo.

  2. R complains about your mistake.

  3. You press the up arrow key to recall your mistaken line.

  4. You use the left and right arrow keys to move the cursor back to the error.

  5. You use the Delete key to delete the offending characters.

  6. You type the corrected characters, which inserts them into the command line.

  7. You press Enter to reexecute the corrected command.

That’s just the basics. R supports the usual keystrokes for recalling and editing command lines, as listed in Table 1-2.

Table 1-2. R command shortcuts

Labeled key     Ctrl-key combo   Effect
Up arrow        Ctrl-P           Recall previous command by moving backward through the history of commands.
Down arrow      Ctrl-N           Move forward through the history of commands.
Backspace       Ctrl-H           Delete the character to the left of the cursor.
Delete (Del)    Ctrl-D           Delete the character to the right of the cursor.
Home            Ctrl-A           Move the cursor to the start of the line.
End             Ctrl-E           Move the cursor to the end of the line.
Right arrow     Ctrl-F           Move the cursor right (forward) one character.
Left arrow      Ctrl-B           Move the cursor left (back) one character.
                Ctrl-K           Delete everything from the cursor position to the end of the line.
                Ctrl-U           Clear the whole darn line and start over.
Tab                              Complete the name (on some platforms).

On most operating systems, you can also use the mouse to highlight commands and then use the usual copy and paste commands to paste text into a new command line.

See Also

See Recipe 2.12. From the Windows main menu, follow Help → Console for a complete list of keystrokes useful for command-line editing.

1.5 Exiting from RStudio

Problem

You want to exit from RStudio.

Solution

Windows and most Linux distributions

Select File → Quit Session from the main menu, or click on the X in the upper-right corner of the window frame.

macOS

Select File → Quit Session from the main menu, or press Cmd-Q, or click on the red circle in the upper-left corner of the window frame.

On all platforms, you can also use the q function (as in quit) to terminate R and RStudio:

q()

Note the empty parentheses, which are necessary to call the function.

Discussion

Whenever you exit, R typically asks if you want to save your workspace. You have three choices:

  • Save your workspace and exit.

  • Don’t save your workspace, but exit anyway.

  • Cancel, returning to the command prompt rather than exiting.

If you save your workspace, R writes it to a file called .RData in the current working directory. Saving the workspace saves any R objects you have created. The next time you start R in the same directory, the workspace will automatically load. Saving your workspace will overwrite the previously saved workspace, if any, so don’t save if you don’t like your changes (e.g., if you have accidentally erased critical data from your workspace).

We recommend never saving your workspace when you exit and instead always explicitly saving your project, scripts, and data. We also recommend that you turn off the prompt to save and autorestore the workspace in RStudio using the global options found in the menu Tools → Global Options and shown in Figure 1-3. This way, when you exit R and RStudio, you won’t be prompted to save your workspace. But keep in mind that any objects created but not saved to disk will be lost!

Figure 1-3. Save workspace options
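One way to save data explicitly, rather than relying on the workspace file, is to write individual objects to disk yourself. The following is a minimal sketch using the base functions saveRDS and readRDS; the data frame and the temporary file path are illustrative assumptions, and in practice you would write to a file inside your project:

```r
# A small data frame standing in for the results of real work
scores <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.7))

# Write the object to a file (a temporary location for this illustration)
path <- file.path(tempdir(), "scores.rds")
saveRDS(scores, path)

# Simulate a fresh session by removing the object, then read it back
rm(scores)
scores <- readRDS(path)
scores
```

Because each object lives in its own named file, you always know what you are loading, which is harder to say about an automatically restored workspace.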

See Also

See Recipe 3.1 for more about the current working directory and Recipe 3.3 for more about saving your workspace. Also see Chapter 2 of R in a Nutshell.

1.6 Interrupting R

Problem

You want to interrupt a long-running computation and return to the command prompt without exiting RStudio.

Solution

Press the Esc key on your keyboard, or click on the Session menu in RStudio and select “Interrupt R.” You may also click on the stop sign icon in the code console window.

Discussion

Interrupting R means telling R to stop running the current command, but without deleting variables from memory or completely closing RStudio. That said, interrupting R can leave your variables in an indeterminate state, depending upon how far the computation had progressed, so check your workspace after interrupting.

See Also

See Recipe 1.5.

1.7 Viewing the Supplied Documentation

Problem

You want to read the documentation supplied with R.

Solution

Use the help.start function to see the documentation’s table of contents:

help.start()

From there, links are available to all the installed documentation. In RStudio the help will show up in the help pane, which by default is on the righthand side of the screen.

In RStudio you can also click Help → R Help to get a listing with help options for both R and RStudio.

Discussion

The base distribution of R includes a wealth of documentation—literally thousands of pages. When you install additional packages, those packages contain documentation that is also installed on your machine.

It is easy to browse this documentation via the help.start function, which opens on the top-level table of contents. Figure 1-4 shows how help.start appears inside the help pane in RStudio.

Figure 1-4. RStudio help.start

The two links in the Reference section are especially useful:

Packages

Click here to see a list of all the installed packages—both the base packages and the additional installed packages. Click on a package name to see a list of its functions and datasets.

Search Engine & Keywords

Click here to access a simple search engine that allows you to search the documentation by keyword or phrase. There is also a list of common keywords, organized by topic; click one to see the associated pages.

The Base R documentation accessed via help.start is loaded on your computer when you install R. The RStudio help, which you access by using the menu option Help → R Help, presents a page with links to RStudio’s website, so you will need internet access to follow the RStudio help links.

See Also

The local documentation is copied from the R Project website, which may have updated documents.

1.8 Getting Help on a Function

Problem

You want to know more about a function that is installed on your machine.

Solution

Use help to display the documentation for the function:

help(functionname)

Use args for a quick reminder of the function arguments:

args(functionname)

Use example to see examples of using the function:

example(functionname)

Discussion

We present many R functions in this book. Every R function has more bells and whistles than we can possibly describe. If a function catches your interest, we strongly suggest reading the help page for that function. One of its bells or whistles might be very useful to you.

Suppose you want to know more about the mean function. Use the help function like this:

help(mean)

This will open the help page for the mean function in the help pane in RStudio. A shortcut for the help command is to simply type ? followed by the function name:

?mean

Sometimes you just want a quick reminder of the arguments to a function: what are they, and in what order do they occur? For this case, use the args function:

args(mean)
#> function (x, ...)
#> NULL
args(sd)
#> function (x, na.rm = FALSE)
#> NULL

The first line of output from args is a synopsis of the function call. For mean, the synopsis shows one argument, x, which is a vector of numbers. For sd, the synopsis shows the same vector, x, and an optional argument called na.rm. (You can ignore the second line of output, which is often just NULL.) In RStudio you will see the args output as a floating tool tip over your cursor when you type a function name, as shown in Figure 1-5.

Figure 1-5. RStudio tool tip
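When you want the argument list as data rather than as printed text, the base function formals returns it as a named list. This is a small sketch alongside args, not a replacement for reading the help page:

```r
# formals() returns the arguments of a function as a named list
sd_args <- formals(sd)

# The argument names, in order
names(sd_args)

# The default value of an optional argument
sd_args$na.rm
```

Here the names are x and na.rm, and the default for na.rm is FALSE, matching the synopsis printed by args(sd).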

Most documentation for functions includes example code near the end of the document. A cool feature of R is that you can request that it execute the examples, giving you a little demonstration of the function’s capabilities. The documentation for the mean function, for instance, contains examples, but you don’t need to type them yourself. Just use the example function to watch them run:

example(mean)
#>
#> mean> x <- c(0:10, 50)
#>
#> mean> xm <- mean(x)
#>
#> mean> c(xm, mean(x, trim = 0.10))
#> [1] 8.75 5.50

Everything you see after example(mean) was produced by R, which executed the examples from the help page and displayed the results.

See Also

See Recipe 1.9 for searching for functions and Recipe 3.6 for more about the search path.

1.9 Searching the Supplied Documentation

Problem

You want to know more about a function that is installed on your machine, but the help function reports that it cannot find documentation for any such function.

Alternatively, you want to search the installed documentation for a keyword.

Solution

Use help.search to search the R documentation on your computer:

help.search("pattern")

A typical pattern is a function name or keyword. Notice that it must be enclosed in quotation marks.

For your convenience, you can also invoke a search by using two question marks (in which case the quotes are not required). Note that searching for a function by name uses one question mark, while searching for a text pattern uses two:

??pattern

Discussion

You may occasionally request help on a function only to be told R knows nothing about it:

help(adf.test)
#> No documentation for 'adf.test' in specified packages and libraries:
#> you could try '??adf.test'

This can be frustrating if you know the function is installed on your machine. Here the problem is that the function’s package is not currently loaded, and you don’t know which package contains the function. It’s kind of a catch-22 (the error message indicates the package is not currently in your search path, so R cannot find the help file; see Recipe 3.6 for more details).

The solution is to search all your installed packages for the function. Just use the help.search function, as suggested in the error message:

help.search("adf.test")

The search will produce a listing of all packages that contain the function:

Help files with alias or concept or title matching 'adf.test' using
regular expression matching:

tseries::adf.test       Augmented Dickey-Fuller Test

Type '?PKG::FOO' to inspect entry 'PKG::FOO TITLE'.

The preceding output indicates that the tseries package contains the adf.test function. You can see its documentation by explicitly telling help which package contains the function:

help(adf.test, package = "tseries")

or you can use the double colon operator to tell R to look in a specific package:

?tseries::adf.test

You can broaden your search by using keywords. R will then find any installed documentation that contains the keywords. Suppose you want to find all functions that mention the Augmented Dickey–Fuller (ADF) test. You could search on a likely pattern:

help.search("dickey-fuller")
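A related base function, apropos, searches the names of objects on your current search path rather than the documentation. It is narrower than help.search, since it sees only packages that are already loaded, but it is handy when you remember part of a function name. A minimal sketch:

```r
# Find objects whose names start with "read.csv"; the pattern is a
# regular expression, so the dot is escaped
matches <- apropos("^read\\.csv")
matches
```

Each name returned can then be passed to help in the usual way.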

See Also

You can also access the local search engine through the documentation browser; see Recipe 1.7 for how this is done. See Recipe 3.6 for more about the search path and Recipe 1.8 for getting help on functions.

1.10 Getting Help on a Package

Problem

You want to learn more about a package installed on your computer.

Solution

Use the help function and specify a package name (without a function name):

help(package = "packagename")

Discussion

Sometimes you want to know the contents of a package (the functions and datasets). This is especially true after you download and install a new package, for example. The help function can provide the contents plus other information once you specify the package name.

This call to help would display the information for the tseries package, provided you have installed it from CRAN (tseries is an add-on package, not part of the base distribution):

help(package = "tseries")

The information begins with a description and continues with an index of functions and datasets. In RStudio, the HTML-formatted help page will open in the help window of the IDE.
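If you would rather inspect a package's contents programmatically, base R can list them for you. This sketch uses the stats and datasets packages because they ship with R and are attached by default; for an add-on package you would need to install and load it first:

```r
# List the objects exported by an attached package
stats_objects <- ls("package:stats")
length(stats_objects)
"sd" %in% stats_objects

# List the datasets that a package provides
dataset_index <- data(package = "datasets")$results
"mtcars" %in% dataset_index[, "Item"]
```

Both membership checks return TRUE: sd lives in stats, and mtcars is one of the datasets supplied by the datasets package.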

Some packages also include vignettes, which are additional documents such as introductions, tutorials, or reference cards. They are installed on your computer as part of the package documentation when you install the package. The help page for a package includes a list of its vignettes near the bottom.

You can see a list of all vignettes on your computer by using the vignette function:

vignette()

In RStudio this will open a new tab listing every package installed on your computer that includes vignettes as well as the vignette names and descriptions.

You can see the vignettes for a particular package by including its name:

vignette(package = "packagename")

Each vignette has a name, which you use to view the vignette:

vignette("vignettename")

See Also

See Recipe 1.8 for getting help on a particular function in a package.

1.11 Searching the Web for Help

Problem

You want to search the web for information and answers regarding R.

Solution

Inside R, use the RSiteSearch function to search by keyword or phrase:

RSiteSearch("key phrase")

Inside your browser, try using these sites for searching:

RSeek

This is a Google custom search engine that is focused on R-specific websites.

Stack Overflow

Stack Overflow is a searchable Q&A site from Stack Exchange that is oriented toward programming issues such as data structures, coding, and graphics. Stack Overflow is a great “first stop” for all your syntax questions.

Cross Validated

Cross Validated is a Stack Exchange site focused on statistics, machine learning, and data analysis rather than programming. It’s a good place for questions about what statistical method to use.

RStudio Community

The RStudio Community site is a discussion forum hosted by RStudio. The topics include R, RStudio, and associated technology. Being an RStudio site, this forum is often visited by RStudio staff and those who use the software frequently. This is a good place for general questions and questions that possibly don’t fit as well into the Stack Overflow syntax-focused format.

Discussion

The RSiteSearch function will open a browser window and direct it to the search engine on the R Project website. There you will see an initial search that you can refine. For example, this call would start a search for “canonical correlation”:

RSiteSearch("canonical correlation")

This is quite handy for doing quick web searches without leaving R. However, the search scope is limited to R documentation and the mailing list archives.

RSeek provides a wider search. Its virtue is that it harnesses the power of the Google search engine while focusing on sites relevant to R. That eliminates the extraneous results of a generic Google search. The beauty of RSeek is that it organizes the results in a useful way.

Figure 1-6 shows the results of visiting RSeek and searching for “correlation.” Note that the tabs across the top allow for drilling in to different types of content:

  • All results

  • Packages

  • Books

  • Support

  • Articles

  • For Beginners

Figure 1-6. RSeek

Stack Overflow is a Q&A site, which means that anyone can submit a question and experienced users will supply answers—often there are multiple answers to each question. Readers vote on the answers, so good answers tend to rise to the top. This creates a rich database of Q&A dialogues, which you can search. Stack Overflow is strongly problem-oriented, and the topics lean toward the programming side of R.

Stack Overflow hosts questions for many programming languages; therefore, when entering a term into its search box, prefix it with “[r]” to focus the search on questions tagged for R. For example, searching for “[r] standard error” will select only the questions tagged for R and will avoid the Python and C++ questions.
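If you find yourself running such searches often, you can build the search address from within R. The sketch below is only an illustration: it assumes the Stack Overflow search page accepts the query in a q parameter, and it uses the base function URLencode to escape the brackets and spaces. The variable names are ours:

```r
# The search term, prefixed with the [r] tag
query <- "[r] standard error"

# Escape reserved characters such as brackets and spaces
encoded <- utils::URLencode(query, reserved = TRUE)

# Assumption: the site's search page takes the query in a "q" parameter
search_url <- paste0("https://stackoverflow.com/search?q=", encoded)
search_url

# Uncomment to open the search in your default browser
# browseURL(search_url)
```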

Stack Overflow also includes a wiki about the R language that provides an excellent community-curated list of online R resources.

Stack Exchange (the parent company of Stack Overflow) has a Q&A area for statistical analysis called Cross Validated. This area is more focused on statistics than programming, so use it when seeking answers that are more concerned with statistics in general and less with R in particular.

RStudio hosts its own discussion board as well. This is a great place to ask general questions and more conceptual questions that may not work as well on Stack Overflow.

See Also

If your search reveals a useful package, use Recipe 3.10 to install it on your machine.

1.12 Finding Relevant Functions and Packages

Problem

Of the 10,000+ packages for R, you have no idea which ones would be useful to you.

Solution

  • To discover packages related to a certain field, visit CRAN’s list of task views. Select the task view for your area, which will give you links to and descriptions of relevant packages. Or visit RSeek, search by keyword, click on the Task Views tab, and select an applicable task view.

  • Visit crantastic and search for packages by keyword.

  • To find relevant functions, visit RSeek, search by name or keyword, and click on the Functions tab.

Discussion

This problem is especially vexing for beginners. You think R can solve your problems, but you have no idea which packages and functions would be useful. A common question on the mailing lists is: “Is there a package to solve problem X?” That is the silent scream of someone drowning in R.

As of this writing, there are more than 10,000 packages available for free download from CRAN. Each package has a summary page with a short description and links to the package documentation. Once you’ve located a potentially interesting package, you would typically click on the “Reference manual” link to view the PDF documentation with full details. (The summary page also contains download links for installing the package, but you’ll rarely install the package that way; see Recipe 3.10.)

Sometimes you simply have a generic interest—such as Bayesian analysis, econometrics, optimization, or graphics. CRAN contains a set of task view pages describing packages that may be useful. A task view is a great place to start since you get an overview of what’s available. You can see the list of task view pages at CRAN Task Views or search for them as described in the Solution. CRAN’s Task Views lists a number of broad fields and shows packages that are used in each field. For example, there are task views for high-performance computing, genetics, time series, and social science, just to name a few.

Suppose you happen to know the name of a useful package—say, by seeing it mentioned online. A complete alphabetical list of packages is available at CRAN with links to the package summary pages.

See Also

You can download and install an R package called sos that provides other powerful ways to search for packages; see the vignette at SOS.

1.13 Searching the Mailing Lists

Problem

You have a question, and you want to search the archives of the mailing lists to see whether your question was answered previously.

Solution

Open Nabble in your browser. Search for a keyword or other search term from your question. This will show results from the support mailing lists.

Discussion

This recipe is really just an application of Recipe 1.11. But it’s an important application, because you should search the mailing list archives before submitting a new question to the list. Your question has probably been answered before.

See Also

CRAN has a list of additional resources for searching the web; see CRAN Search.

1.14 Submitting Questions to Stack Overflow or Elsewhere in the Community

Problem

You have a question you can’t find the answer to online, so you want to submit a question to the R community.

Solution

The first step to asking a question online is to create a reproducible example. Having example code that someone can run and see your exact problem is the most critical part of asking for help online. A question with a good reproducible example has three components:

Example data

This can be simulated data or some real data that you provide.

Example code

This code shows what you have tried or an error you are getting.

Written description

This is where you explain what you have, what you’d like to have, and what you have tried that didn’t work.

The details of writing a reproducible example are covered in the Discussion. Once you have a reproducible example, you can post your question on Stack Overflow. Be sure to include the r tag in the Tags section of the ask page.

If your question is more general or related to concepts instead of specific syntax, RStudio runs an RStudio Community discussion forum. Note that the site is broken into multiple topics, so pick the topic category that best fits your question.

Or you may submit your question to the R mailing lists. Pick one venue, though: posting the same question to the mailing lists, Stack Overflow, and other sites at the same time is considered rude cross-posting.

The mailing lists page contains general information and instructions for using the R-help mailing list. Here is the general process:

  1. Subscribe to the main R mailing list, R-help.

  2. Write your question carefully and correctly and include your reproducible example.

  3. Mail your question to r-help@r-project.org.

Discussion

The R-help mailing list, Stack Overflow, and the RStudio Community site are great resources, but please treat them as a last resort. Read the help pages, read the documentation, search the help list archives, and search the web. It is most likely that your question has already been answered. Don’t kid yourself: very few questions are unique. If you’ve exhausted all other options, though, maybe it’s time to create a good question.

The reproducible example is the crux of a good help request. The first component is example data. A good way to get this is to simulate the data using a few R functions. The following example creates a data frame called example_df that has three columns, each of a different data type:

set.seed(42)
n <- 4
example_df <- data.frame(
  some_reals = rnorm(n),
  some_letters = sample(LETTERS, n, replace = TRUE),
  some_ints = sample(1:10, n, replace = TRUE)
)
example_df
#>   some_reals some_letters some_ints
#> 1      1.371            R        10
#> 2     -0.565            S         3
#> 3      0.363            L         5
#> 4      0.633            S        10

Note that this example calls set.seed at the beginning. This ensures that every time the code is run, the same random values are generated. The n value is the number of rows of example data you would like to create. Make your example data as simple as possible to illustrate your question.
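To see what set.seed does, you can reset the seed and draw again. This is only a small sketch; the variable names first_draw and second_draw are ours, chosen for illustration:

```r
# Draw four random normals after fixing the seed
set.seed(42)
first_draw <- rnorm(4)

# Reset the seed to the same value and draw again
set.seed(42)
second_draw <- rnorm(4)

# The two draws match exactly because the generator started from the same state
identical(first_draw, second_draw)
#> [1] TRUE
```

Without the second call to set.seed, the second draw would continue from wherever the generator left off and the two vectors would differ.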

An alternative to creating simulated data is to use example data that comes with R. For example, the mtcars dataset is a data frame with 32 records about different car models:

data(mtcars)
head(mtcars)
#>                    mpg cyl disp  hp drat   wt qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.62 16.5  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.88 17.0  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.32 18.6  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.21 19.4  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.44 17.0  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.46 20.2  1  0    3    1

If your example is reproducible only with your own data, you can use dput to convert a bit of your own data into R code that re-creates it, which you can then paste into your example. We’ll illustrate that approach using two rows from the mtcars dataset:

dput(head(mtcars, 2))
#> structure(list(mpg = c(21, 21), cyl = c(6, 6), disp = c(160,
#> 160), hp = c(110, 110), drat = c(3.9, 3.9), wt = c(2.62, 2.875
#> ), qsec = c(16.46, 17.02), vs = c(0, 0), am = c(1, 1), gear = c(4,
#> 4), carb = c(4, 4)), row.names = c("Mazda RX4", "Mazda RX4 Wag"
#> ), class = "data.frame")

You can put the resulting structure directly in your question:

example_df <- structure(list(mpg = c(21, 21), cyl = c(6, 6), disp = c(160,
160), hp = c(110, 110), drat = c(3.9, 3.9), wt = c(2.62, 2.875
), qsec = c(16.46, 17.02), vs = c(0, 0), am = c(1, 1), gear = c(4,
4), carb = c(4, 4)), row.names = c("Mazda RX4", "Mazda RX4 Wag"
), class = "data.frame")

example_df
#>               mpg cyl disp  hp drat   wt qsec vs am gear carb
#> Mazda RX4      21   6  160 110  3.9 2.62 16.5  0  1    4    4
#> Mazda RX4 Wag  21   6  160 110  3.9 2.88 17.0  0  1    4    4
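If you want to confirm that the dput output really rebuilds your data before you post it, one possible check is to capture the text, evaluate it, and compare the result with the original. This is a sketch rather than a required step; the names original, dput_text, and rebuilt are hypothetical:

```r
# The data we intend to share
original <- head(mtcars, 2)

# Capture what dput would print, as lines of text
dput_text <- capture.output(dput(original))

# Parse and evaluate that text, as a reader pasting it into R would
rebuilt <- eval(parse(text = dput_text))

# Compare the rebuilt data frame with the original
all.equal(original, rebuilt)
#> [1] TRUE
```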

The second part of a good reproducible example is the example code. The code example should be as simple as possible and illustrate what you are trying to do or have already tried. It should not be a big block of code with many different things going on. Boil your example down to only the minimal amount of code needed. If you use any packages, be sure to include the library call at the beginning of your code. Also, don’t include anything in your question that is potentially harmful to someone running your code, such as rm(list=ls()), which would delete all R objects in memory. Have empathy for the person trying to help you, and realize that they are volunteering their time to help you out and may run your code on the same machine they use to do their own work.
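As an illustration of how small the example code can be, suppose the question were "Why does mean return NA for my column?" That question, the data, and the variable names below are all hypothetical, but the whole problem fits in a few lines:

```r
# Tiny example data: one missing value is enough to show the problem
example_df <- data.frame(
  group = c("a", "a", "b"),
  value = c(1, NA, 3)
)

# What we tried: the result is NA, which is what we are asking about
result_with_na <- mean(example_df$value)
result_with_na
#> [1] NA

# What we expected to get: the mean of the non-missing values
result_without_na <- mean(example_df$value, na.rm = TRUE)
result_without_na
#> [1] 2
```

This example uses only base R, so no library call is needed. If it relied on a package, the library call would go on the first line.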

To test your example, open a new R session and try running it. Once you’ve edited your code, it’s time to give just a bit more information to your potential respondents. In plain text, describe what you were trying to do, what you’ve tried, and your question. Be as concise as possible. As with the example code, your objective is to communicate as efficiently as possible with the person reading your question. You may find it helpful to include in your description which version of R you are running as well as which platform (Windows, macOS, Linux). You can get that information easily with the sessionInfo function.
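Pasting the full output of sessionInfo into your question is usually fine. If you would rather quote only the essentials, a sketch like the following pulls out the R version and the platform. The output depends on your machine, so none is shown here, and the variable name info is ours:

```r
# Collect details about the current R session
info <- sessionInfo()

# The R version, as a single human-readable string
info$R.version$version.string

# The platform R was built for (operating system and architecture)
info$platform
```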

If you are going to submit your question to the R mailing list, you should know there are actually several mailing lists. R-help is the main list for general questions. There are also many special interest group (SIG) mailing lists dedicated to particular domains such as genetics, finance, R development, and even R jobs. You can see the full list at https://stat.ethz.ch/mailman/listinfo. If your question is specific to a domain, you’ll get a better answer by selecting the appropriate list. As with R-help, however, carefully search the SIG list archives before submitting your question.

See Also

We suggest that you read Eric Raymond and Rick Moen’s excellent essay entitled “How to Ask Questions the Smart Way” before submitting any question. Seriously. Read it.

Stack Overflow has an excellent post that includes details about creating a reproducible example. You can find that at https://stackoverflow.com/q/5963269/37751.

Jenny Bryan has a great R package called reprex that helps in the creation of a good reproducible example and provides helper functions for writing the markdown text for sites like Stack Overflow. You can find that package on her GitHub page.
