Classifying data points with a Random Forest model using MLlib
In this recipe, we will demonstrate how you can classify data points using the Random Forest algorithm with MLlib.
- You will be using the Maven project you created in the recipe named Solving simple text mining problems with Apache Spark. If you have not done so yet, then follow steps 1-6 in the Getting ready section of that recipe.
- Go to https://github.com/apache/spark/blob/master/data/mllib/sample_binary_classification_data.txt, download the data, and save it as rf-data.txt in the data folder of the project that you created by following the instructions in step 1. Alternatively, you can create a text file named rf-data.txt in the data folder of your project and copy-paste the data ...
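Before moving on, it can help to confirm that the file was saved intact. The sample data is in LIBSVM text format: each line starts with a numeric label, followed by space-separated index:value pairs with one-based feature indices. The following is a minimal sketch in plain Java with no Spark dependency that parses one such line. The class name LibSvmLineCheck, the Row holder, and the parseLine helper are hypothetical names introduced only for illustration; they are not part of MLlib, which can read this format directly through MLUtils.loadLibSVMFile.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for sanity-checking a line of rf-data.txt.
// Assumes LIBSVM text format: "<label> <index>:<value> <index>:<value> ..."
public class LibSvmLineCheck {

    /** Label plus sparse features parsed from a single line. */
    public static final class Row {
        public final double label;
        // Maps the one-based feature index to its value, in file order.
        public final Map<Integer, Double> features;

        Row(double label, Map<Integer, Double> features) {
            this.label = label;
            this.features = features;
        }
    }

    /** Parses one non-blank line; throws if a feature token is not index:value. */
    public static Row parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        double label = Double.parseDouble(tokens[0]);

        Map<Integer, Double> features = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            String[] pair = tokens[i].split(":");
            if (pair.length != 2) {
                throw new IllegalArgumentException("Malformed feature token: " + tokens[i]);
            }
            features.put(Integer.parseInt(pair[0]), Double.parseDouble(pair[1]));
        }
        return new Row(label, features);
    }

    public static void main(String[] args) {
        // A made-up line in the same format, not taken from the real data file.
        Row row = parseLine("1 3:0.5 7:2.0");
        System.out.println("label=" + row.label + ", features=" + row.features);
    }
}
```

To check the real file, you could read data/rf-data.txt line by line and pass each non-blank line to parseLine; an exception on any line suggests the copy-paste or download went wrong.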