You have already seen a classification algorithm, which employed a hyperplane to classify records among categories. Decision trees are indeed another way to meet the same objective. What you do with decision trees is split your records into groups based on the values of explanatory variables, seeking the grouping able to minimize the residual sum of squares we have already talked about.
As is often the case, the devil is in the details, and with decision trees, the relevant detail is how we define the grouping. This is actually obtained through recursive binary splitting. I will help you understand how it works with a set of my wonderful sketches. It all starts with a blank sheet. ...