© Pramod Singh 2019
Pramod Singh, Machine Learning with PySpark, https://doi.org/10.1007/978-1-4842-4131-8_6

6. Random Forests

Pramod Singh
Bangalore, Karnataka, India

This chapter focuses on building Random Forests (RF) with PySpark for classification. We will look at how they work and how they make predictions. Before going further, though, we need to understand the building block of an RF: the decision tree (DT). A decision tree can be used on its own for classification or regression, but in terms of accuracy, random forests generally outperform single decision tree classifiers, for reasons that we will cover later in the chapter. Let’s learn more about decision trees.

Decision Tree

A decision tree falls under the supervised category of machine learning ...
