At this point, we can introduce a more general method of creating boosted ensembles. Let's choose a generic algorithm family, represented as follows:
Each model is parametrized using the vector θi and there are no restrictions on the kind of method that is employed. In this case, we are going to consider decision trees (which is one of the most diffused algorithms when this boosting strategy is employed—in this case, the algorithm is known as gradient tree boosting), but the theory is generic and can be easily applied to more complex models, such as neural networks. In a decision tree, the parameter vector θi is made up of ...