• In the one-hot encoding solution, can you replace logistic regression with other classifiers supported in PySpark, such as decision tree, random forest, and linear SVM?
  • In the feature hashing solution, can you try other hash sizes, such as 5,000 and 20,000? What do you observe?
  • In the feature interaction solution, can you try other interactions, such as C1 and C20?
  • Can you apply feature interaction first and then feature hashing, in order to lower the expanded dimensionality? Can you obtain a higher AUC this way?
