Case study – predicting political affiliation

For our next use case, we will use congressional voting records from the US House of Representatives to build a classification tree in order to predict whether a given congressman or woman is a Republican or a Democrat.

The specific congressional voting dataset that we will use is available from both the GitHub repository accompanying this book and UCI's machine learning repository at It has been cited by Dua, D., and Karra Taniskidou, E. (2017). UCI Machine Learning Repository []. Irvine, CA: University of California, School of Information and Computer Science.

If you open congressional-voting-data/ ...

Get Machine Learning with Apache Spark Quick Start Guide now with the O’Reilly learning platform.

O’Reilly members experience live online training, plus books, videos, and digital content from nearly 200 publishers.