January 2018
Intermediate to advanced
332 pages
7h 36m
English
Now, let's break the dataset down further and calculate the entropy based on each branch. We have the following four main branches here:
Let's start with the branch for Outlook:
| Outlook | Play: Yes | Play: No | Total |
|---|---|---|---|
| Sunny | 2 | 3 | 5 |
| Overcast | 4 | 0 | 4 |
| Rain | 3 | 1 | 4 |
| Total | | | 13 |
To calculate the entropy of the branch, we first calculate the probability of each sub-branch and multiply it by the entropy of that sub-branch. We then add the weighted entropies of all sub-branches to get the total entropy of the branch, from which we can calculate the information gain for the branch:
P(Play, Outlook) = P(Overcast) * E(4,0) + P(Sunny) * E(2,3) + P(Rain) * E(3,1)
= (4/13) * 0 ...
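The weighted sum above can be checked with a few lines of Python. This is a minimal sketch, not the book's own code: the `entropy` helper and the `outlook` dictionary are illustrative names, the counts are copied from the Outlook table, and the information gain is assumed to follow the usual definition (entropy before the split, taken from the column sums of 9 Yes and 4 No, minus the weighted entropy of the branch).

```python
from math import log2


def entropy(*counts):
    """Shannon entropy (in bits) of class counts, e.g. entropy(2, 3)."""
    total = sum(counts)
    # Zero counts are skipped, so 0 * log2(0) is treated as 0.
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


# (Yes, No) counts for each Outlook value, copied from the table.
outlook = {"Sunny": (2, 3), "Overcast": (4, 0), "Rain": (3, 1)}

n = sum(sum(c) for c in outlook.values())  # 13 rows in total

# Weighted entropy of the branch: sum of P(value) * E(yes, no).
branch_entropy = sum((sum(c) / n) * entropy(*c) for c in outlook.values())

# Assumption: entropy before the split uses the column sums (9 Yes, 4 No).
yes = sum(c[0] for c in outlook.values())
no = sum(c[1] for c in outlook.values())
parent_entropy = entropy(yes, no)

# Assumption: gain = entropy before the split - weighted branch entropy.
information_gain = parent_entropy - branch_entropy

print(f"Outlook branch entropy: {branch_entropy:.3f}")
print(f"Information gain:       {information_gain:.3f}")
```

The Overcast sub-branch contributes nothing to the sum because all four of its rows are Yes, so its entropy is zero, which matches the `(4/13) * 0` term in the expansion.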