June 2017
Beginner to intermediate
576 pages
15h 22m
English
Take a look at the following tips:
Sample when possible. Use the sample_bin methodology and the filter command liberally. Sampling speeds up both the analysis phase and the development/testing phase.
Once testing has been completed on a smaller segment, it can be scaled up to a much larger population with confidence.
Preprocess the data so that you can select potentially interesting subsegments.
Cache analysis when it makes sense.
If performance becomes a problem, try increasing the number of partitions in your data.
For larger number crunching, bring back a representative sample to local R.
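The sampling tips above can be illustrated with a minimal sketch. It is written in plain Python using only the standard library, not Spark or R, and it assumes that the sample_bin methodology means giving every row a numbered bin and then filtering on that bin. The function names, the field names, and the data are all hypothetical.

```python
import hashlib


def assign_bin(key, n_bins=100):
    """Map a record key to a stable bin number in 1..n_bins (assumed scheme)."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_bins + 1


def sample_by_bin(rows, key_field, keep_bins, n_bins=100):
    """Keep only the rows whose bin number falls in keep_bins."""
    keep = set(keep_bins)
    return [row for row in rows if assign_bin(row[key_field], n_bins) in keep]


# Hypothetical data: 10,000 records with an id and a value.
rows = [{"id": i, "value": i % 7} for i in range(10_000)]

# Develop and test on a small slice of the data (bins 1 to 5 of 100).
dev_sample = sample_by_bin(rows, "id", range(1, 6))

# Scale up by widening the bin range; the earlier sample stays a subset.
bigger_sample = sample_by_bin(rows, "id", range(1, 51))
```

Because the bin is derived from the record key rather than drawn at random on each run, the same rows land in the same bins every time, so results from the small sample remain comparable once the bin range is widened.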