
Practical Predictive Analytics by Ralph Winters


Running a summary of the dataframe and saving the object

Next, we will create a summary of the dataframe. Instead of just printing the summary as we have done in the past, we will also save the results to an object, since we will use the aggregate information later to join it back to the original data. Again, if you are manipulating large datasets, there is no need to run summary() more than once on the same data:

  1. Set up some global options, such as the number of significant digits, plot width and height, etc.
  2. Assign the summary output to a dataframe.
  3. Use the Databricks display() function (or the head() function) to print a portion of the file:
        options(digits=3)
        options(repr.plot.width ...
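
The three steps above can be sketched in plain R as follows. This is only an illustration under stated assumptions, not the book's own listing: it uses three columns of R's built-in mtcars dataset as stand-in data, the names df and summary_df are hypothetical, and it builds the summary with base R functions rather than a Spark summary() call, so that it runs outside Databricks. Because display() exists only in Databricks notebooks, the sketch prints with head().

```r
# Step 1: set a global option (the value is illustrative).
options(digits = 3)

# Stand-in data (assumption): three numeric columns of the built-in mtcars dataset.
df <- mtcars[, c("mpg", "hp", "wt")]

# Step 2: compute the summary once and save it as a data frame,
# one row per column, so it can be reused later instead of recomputed.
summary_df <- data.frame(
  variable  = names(df),
  mean      = sapply(df, mean),
  sd        = sapply(df, sd),
  min       = sapply(df, min),
  max       = sapply(df, max),
  row.names = NULL
)

# Step 3: print a portion of the saved summary.
head(summary_df)
```

Keeping the aggregates in a data frame with one row per variable means later steps can reuse summary_df directly, which avoids a second pass over a large dataset.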
