Tree-based approaches are non-statistical approaches without many of the assumptions that come along with things like regression. However, there are some pitfalls to keep in mind:
- Single decision tree models can easily overfit to your data, especially if you do not limit the depth of the trees. Most implementations allow you to limit this depth via a parameter (or prune your decision trees). A pruning parameter will often allow you to remove sections of the tree that have little influence on the predictions, and, thus, reduce the overall complexity of the model.
- When we start talking about ensemble models, such as random forest, we are getting into models that are somewhat opaque. ...