February 2019
Beginner to intermediate
382 pages
English
With the encoded training and testing sets ready, we can now train our classification model. We use logistic regression as an example, but PySpark supports many other classification models, such as decision tree classifiers, random forests, neural networks (which we will study in Chapter 9, Stock Price Prediction with Regression Algorithms), linear SVM, and Naïve Bayes. For further details, refer to the following link: https://spark.apache.org/docs/latest/ml-classification-regression.html#classification.
We train and test a logistic regression model in the following steps:
>>> from pyspark.ml.classification ...