Python Machine Learning Cookbook

by Prateek Joshi, Vahid Mirjalili

June 2016

Beginner to intermediate

304 pages

6h 24m

English

Packt Publishing

Estimating the income bracket

We will build a classifier to estimate a person's income bracket from 14 attributes. The two possible output classes are more than 50K and less than or equal to 50K. The twist in this dataset is that each datapoint is a mixture of numbers and strings. The numerical values are meaningful as they are, so we cannot simply run a label encoder over every attribute. We need a system that can handle numerical and non-numerical data at the same time. We will use the census income dataset available at https://archive.ics.uci.edu/ml/datasets/Census+Income.
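To make the mixed-data idea concrete, here is a minimal sketch of one way to handle it: label-encode only the string columns and pass the numerical columns through unchanged. The four sample rows, the column selection, and the `encode_mixed` helper are hypothetical illustrations, not taken from income.py.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical rows in the spirit of the census data:
# age, workclass, education, hours-per-week (all read in as strings)
rows = [
    ["39", "State-gov", "Bachelors", "40"],
    ["50", "Self-emp-not-inc", "Bachelors", "13"],
    ["38", "Private", "HS-grad", "40"],
    ["53", "Private", "11th", "40"],
]
X = np.array(rows)


def encode_mixed(X):
    """Label-encode string columns; keep numerical columns as integers."""
    encoders = {}  # column index -> fitted LabelEncoder
    X_encoded = np.empty(X.shape, dtype=int)
    for col in range(X.shape[1]):
        column = X[:, col]
        if all(item.isdigit() for item in column):
            # Numerical column: the values are meaningful, so keep them
            X_encoded[:, col] = column.astype(int)
        else:
            # String column: map each distinct label to an integer
            encoder = LabelEncoder()
            X_encoded[:, col] = encoder.fit_transform(column)
            encoders[col] = encoder
    return X_encoded, encoders


X_encoded, encoders = encode_mixed(X)
print(X_encoded)
```

Keeping the fitted encoders in a dictionary keyed by column index means the same mapping can later be applied to unseen datapoints, and `inverse_transform` can recover the original labels.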

How to do it…

We will use the provided income.py file as a reference and build the classifier with Naive Bayes. Let's import a couple ...
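As a rough illustration of the Naive Bayes step, the sketch below trains scikit-learn's `GaussianNB` on already-encoded features. The choice of the Gaussian variant is an assumption (the text only says "Naive Bayes"), and the synthetic two-column data stands in for the real census attributes.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.RandomState(0)

# Synthetic, already-encoded features (hypothetical stand-ins for
# something like age and hours-per-week), one cluster per class
X_low = rng.normal(loc=[30, 35], scale=3, size=(50, 2))
X_high = rng.normal(loc=[50, 50], scale=3, size=(50, 2))
X = np.vstack([X_low, X_high])
y = np.array(["<=50K"] * 50 + [">50K"] * 50)

# Fit the classifier and predict the bracket for two new datapoints
classifier = GaussianNB()
classifier.fit(X, y)
predictions = classifier.predict(np.array([[28, 34], [52, 51]]))
print(predictions)
```

On the real dataset, the input to `fit` would be the encoded attribute matrix, with the income bracket as the label column.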