We began this chapter by explaining some of the reasons why large datasets can be a problem for unoptimized R code, such as the lack of automatic parallelization and the absence of native support for out-of-memory data. For the rest of the chapter, we discussed specific routes to optimizing R code to tackle large data.
First, you learned of the dangers of optimizing code too early. Next, we saw (much to the relief of slackers everywhere) that taking the lazy way out, and simply buying or renting a more powerful machine, is often more cost-effective than spending developer time on optimization.
After that, we saw that a little knowledge about how R allocates memory and vectorizes operations can go a long way toward better performance.
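To make that concrete, here is a minimal sketch (an illustration of the general idea, not code from the chapter; the function names are our own) contrasting a loop that grows its result, a loop over a preallocated result, and a fully vectorized computation:

    # Growing a vector one element at a time forces R to reallocate
    # and copy the whole vector on every iteration (roughly O(n^2) work).
    grow_square <- function(n) {
      out <- c()
      for (i in 1:n) out <- c(out, i^2)
      out
    }

    # Preallocating the result avoids the repeated copies (O(n) work).
    prealloc_square <- function(n) {
      out <- numeric(n)
      for (i in 1:n) out[i] <- i^2
      out
    }

    # The vectorized form pushes the loop down into compiled C code.
    vec_square <- function(n) (1:n)^2

All three return the same result, but timing them, for example with system.time() or microbenchmark::microbenchmark(), shows that the preallocated loop avoids the quadratic copying cost of the growing loop, and the vectorized version is faster still because the iteration happens in compiled code rather than in the R interpreter.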
The next two sections focused less ...