Performing the analyses in R
Now that we have our data ready, we will focus on performing the analyses in R.
Classification with C4.5
We will first predict the income of the participants using C4.5.
The unpruned tree
We will start by examining the unpruned tree. We build it with the J48() function in RWeka (J48 is Weka's implementation of C4.5), which uses the formula notation we have seen previously. The dot (.) after the tilde indicates that all attributes except the class attribute are used as predictors. We use the control argument to tell R that we want an unpruned tree (we will discuss pruning later):
C45tree = J48(income ~ ., data = AdultTrain, control = Weka_control(U = TRUE))
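To see what the dot in the formula stands for, the following minimal sketch expands a formula of the same shape against a small data frame using base R only. The data frame toy and its values are hypothetical stand-ins made up for illustration; they are not the AdultTrain data, and no RWeka call is involved.

```r
# Hypothetical toy data standing in for AdultTrain (values are invented).
toy <- data.frame(
  age       = c(25, 40, 31),
  education = factor(c("HS", "BSc", "MSc")),
  income    = factor(c("small", "large", "large"))
)

# Same shape of formula as in the J48() call: class attribute ~ everything else.
f <- income ~ .

# terms() expands the dot against the columns of the data frame.
expanded <- terms(f, data = toy)

# The predictors are every column except the class attribute:
# here, age and education.
predictors <- attr(expanded, "term.labels")
print(predictors)
```

The class attribute on the left of the tilde is excluded from the expansion, which is why it never appears among its own predictors.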
You can examine the tree by typing its name, C45tree, at the R prompt.
We will not display it here as it is very ...