Optimizing Your R Code
Once you figure out where your program is spending its time, you can focus on improving those areas. This section describes some common causes for poor performance and shows how to resolve them.
Using Vector Operations
R is a functional language with built-in support for vector operations. Whenever possible, you should use vector operations in your code instead of writing iterative algorithms. This section explains why.
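As a minimal illustration of what "vector operations" means here, the sketch below applies arithmetic and a built-in function to whole vectors at once, with no explicit loop. The vectors `x` and `y` and the result names are invented for this example and do not come from the book.

```r
# Illustrative data (assumed for this sketch, not from the book).
x <- c(1, 2, 3, 4)
y <- c(10, 20, 30, 40)

# Arithmetic operators work element by element on whole vectors.
summed  <- x + y    # 11 22 33 44
doubled <- x * 2    # 2 4 6 8

# Many built-in functions are vectorized as well.
roots <- sqrt(c(4, 9, 16))    # 2 3 4

print(summed)
print(doubled)
print(roots)
```

Each expression hands the entire vector to R's internal code in a single call, so no element-by-element loop has to be written in R itself.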
Iterative algorithms and vector operations
Let’s consider a simple problem: calculating a vector with the square of every integer between 1 and n. Consider the following naive implementation:
> naive.vector.of.squares <- function(n) {
+   v <- 1:n
+   for (i in 1:n)
+     v[i] <- v[i]^2
+   v
+ }
> naive.vector.of.squares(10)
 [1]   1   4   9  16  25  36  49  64  81 100
How does the performance of this function vary with n? Let’s do a quick experiment:
> # 10,000 values
> system.time(naive.vector.of.squares(10000))
   user  system elapsed
  0.037   0.000   0.037
> # 10,000,000 values
> system.time(naive.vector.of.squares(10000000))
   user  system elapsed
 30.211   0.233  30.178
As you can see, the time required to compute the vector grows roughly linearly with the size of the vector (n): 1,000 times as many elements took roughly 800 times as long. This makes sense: R is looping through all n elements in the vector and changing each element one at a time. (Note that R doesn’t actually copy the vector v repeatedly inside the loop; see “Objects Are Copied in Assignment Statements” for more about how this works.)
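One way to check the scaling on your own machine is to time the same function at several sizes and compare the cost per element. The following is only a sketch: the helper `time.squares` and the chosen sizes are assumptions made for illustration, and the absolute timings will differ from those above depending on hardware and R version.

```r
# Same loop-based function as in the text.
naive.vector.of.squares <- function(n) {
  v <- 1:n
  for (i in 1:n)
    v[i] <- v[i]^2
  v
}

# Hypothetical helper (not from the book): elapsed seconds for one call.
time.squares <- function(n) {
  unname(system.time(naive.vector.of.squares(n))["elapsed"])
}

# Sizes chosen arbitrarily for this sketch; each is double the previous one.
sizes <- c(1e5, 2e5, 4e5, 8e5)
elapsed <- sapply(sizes, time.squares)

# If run time is linear in n, seconds per element stays roughly constant.
print(data.frame(n = sizes, elapsed = elapsed, per.element = elapsed / sizes))
```

Single timings are noisy, especially for small n, so repeat the measurement a few times before drawing conclusions from the per-element column.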
It turns out that there is a much better way to implement ...