O'Reilly logo

Genetic Algorithms and Machine Learning for Programmers by Frances Buontempo

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Let’s Find That Paper Bag

Using the saved data from the previous chapter, you first need to load it:

 import​ ​pickle
 
 with​ open(​"data"​, ​"rb"​) ​as​ f:
  L = pickle.load(f)

Your data is a list of lists. Each inner list has three items: [’x’, ’y’, ’out’]. Your tree will predict the last item: ’out’. You’ll provide a label for each column to help make readable rules from the tree which will be built up of sub-trees using split points. But first, you need to find the split points.

Find Split Points

You use information gain to find split points. This needs the proportions of data in each group. You can use the collections library to find counts of each value, which gets you most of the way there.

Try it on a list of numbers:

 import ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required