How to Grow a Decision Tree

You build a tree from leaf nodes and sub-trees. The algorithm looks a lot like quicksort, partitioning the data and proceeding recursively:

    ID3(data, features, tree = {}):
        if data is (mostly) in same category:
            return leaf_node(data)
        feature = pick_one(data, features)
        tree[feature] = {}
        groups = partition(data, feature)
        for group in groups:
            tree[feature][group] = ID3(group, features)
        return tree

You partition the data into groups that share the same value of your chosen feature. You build up sub-trees, and you make a leaf node when all, or most, of the data in a group is in the same category. A group might hold just one data item.
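The recursion described above can be sketched in Python. This is a minimal illustration under several assumptions, not the book's implementation: each data item is a dict with a hypothetical "label" key for its category, branches are keyed by the feature's value, a feature is dropped once it has been used, and pick_one is a placeholder that takes the first remaining feature. The toy rows are invented for the example.

```python
from collections import Counter, defaultdict


def leaf_node(data, label_key="label"):
    """Return the most common category among the rows."""
    return Counter(row[label_key] for row in data).most_common(1)[0][0]


def mostly_same(data, label_key="label", threshold=1.0):
    """True when at least `threshold` of the rows share one category."""
    top_count = Counter(row[label_key] for row in data).most_common(1)[0][1]
    return top_count / len(data) >= threshold


def partition(data, feature):
    """Group rows by their value for `feature`."""
    groups = defaultdict(list)
    for row in data:
        groups[row[feature]].append(row)
    return groups


def pick_one(data, features):
    """Placeholder (assumption): take the first remaining feature."""
    return features[0]


def id3(data, features, label_key="label"):
    # Stop when the rows agree on a category or no features are left.
    if mostly_same(data, label_key) or not features:
        return leaf_node(data, label_key)
    feature = pick_one(data, features)
    # Assumption: a feature is not reused further down the same branch.
    remaining = [f for f in features if f != feature]
    tree = {feature: {}}
    for value, group in partition(data, feature).items():
        tree[feature][value] = id3(group, remaining, label_key)
    return tree


# Toy data, invented for illustration only.
data = [
    {"outlook": "sunny", "windy": "no", "label": "play"},
    {"outlook": "sunny", "windy": "yes", "label": "stay"},
    {"outlook": "rainy", "windy": "no", "label": "stay"},
    {"outlook": "rainy", "windy": "yes", "label": "stay"},
]
tree = id3(data, ["outlook", "windy"])
print(tree)
```

On this toy data the "rainy" group is already in one category, so it becomes a leaf straight away, while the "sunny" group is split again on "windy". Lowering threshold below 1.0 would make a leaf as soon as the data is mostly, rather than entirely, in one category.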

To decide a feature on which to partition the data, you can pick a ...
