Chapter 13. Scaling Up

Up to this point, we have covered a range of topics in statistics and, specifically, predictive analytics. In this chapter, we provide a tutorial on applying those concepts and practices to very large datasets. First, we'll define the phrase very large, at least as it is used to describe the data that we want to train our predictive models on or run our statistical algorithms against. Next, we will review the challenges imposed by using bigger data sources, and finally, we will offer some ideas for meeting these challenges.

This chapter is broken down into the following sections:

  • Getting started
  • The phases of an analytics project
  • Experience and data of scale ...