Chapter 9. New and Upcoming

In a perfect world we could stop time to research and write, since a technical book covers a moving target. We had no such luxury, so we set aside this space to cover some new arrivals.

This chapter mentions a few tools that we could have covered in more depth, had we been willing to postpone the book’s release date. Think of it as a look into one possible future of R parallelism. Special thanks to the colleagues, reviewers, and friends who so kindly brought these tools to our attention.

doRedis

The foreach() function[62] evaluates an arbitrary R expression once for each element of an input. foreach()’s strength is that it can run those evaluations in parallel when you supply a parallel backend. The doRedis package provides such a backend, using the Redis datastore[63] as a job queue.
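To make the backend idea concrete, here is a minimal sketch of a foreach() loop. It assumes only that the foreach package is installed, and it registers the sequential backend that ships with foreach so that it runs anywhere. The variable name squares is ours, chosen for illustration.

```r
library(foreach)

# Register the sequential backend bundled with foreach. A parallel backend
# such as doRedis would be registered at this same point instead.
registerDoSEQ()

# Evaluate the expression once per value of i, and combine the results
# into a single vector with c().
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

print(squares)
```

The loop body stays the same whichever backend is registered. Only the registration call changes.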

doRedis can work locally to take advantage of multicore systems, and also farm tasks out to remote R instances (“workers”). It’s straightforward to add or remove workers at runtime—even in mid-job—to adapt to changing work conditions or speed up job processing. Similar to Hadoop, doRedis is fault-tolerant in that failed tasks are automatically resubmitted to their job queue.
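The following is a rough sketch of how a local doRedis job might look, not a definitive recipe. It assumes that the doRedis package is installed and that a Redis server is reachable at the given host and port. The helper name run_on_redis, the queue name "jobs", and the worker count are hypothetical choices made for this example.

```r
library(foreach)

# Hypothetical helper: run a small foreach() job through a doRedis queue.
# Assumes doRedis is installed and a Redis server is listening on host:port.
run_on_redis <- function(queue = "jobs", n_workers = 2,
                         host = "localhost", port = 6379) {
  # Register doRedis as the foreach() backend, with `queue` as the job queue.
  doRedis::registerDoRedis(queue, host = host, port = port)

  # Delete the queue's keys from Redis when this function exits.
  on.exit(doRedis::removeQueue(queue), add = TRUE)

  # Start worker R processes on this machine that listen on the queue.
  doRedis::startLocalWorkers(n = n_workers, queue = queue,
                             host = host, port = port)

  # Tasks are placed on the queue and picked up by whichever workers
  # are currently listening.
  foreach(i = 1:10, .combine = c) %dopar% sqrt(i)
}

# Usage, with a Redis server running locally:
# roots <- run_on_redis()
```

A remote machine could join the same queue by calling doRedis's redisWorker() function with the queue name and the Redis server's host, for example redisWorker("jobs", host = "redis.example.com"), where the host name is a placeholder.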

doRedis supports Linux, Mac OS X, and Windows systems.

RevoScaleR and RevoConnectR (RHadoop)

Revolution Analytics is a company that provides R tools, support, and training. They have two products of note.

First up is the ...
