
Similarly, the Infogain measure of the word “system” is as below:
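\[
\begin{aligned}
\text{Infogain}(S, \text{System}) &= \text{Entropy}(S) - \frac{8}{12}\,\text{Entropy}(S_0) - \frac{4}{12}\,\text{Entropy}(S_1) \\
&= 1.553 - \frac{8}{12}\left[-\frac{4}{8}\log_2\frac{4}{8} - \frac{2}{8}\log_2\frac{2}{8} - \frac{2}{8}\log_2\frac{2}{8}\right] - \frac{4}{12}\left[-\frac{1}{4}\log_2\frac{1}{4} - \frac{2}{4}\log_2\frac{2}{4} - \frac{1}{4}\log_2\frac{1}{4}\right] \\
&= 1.553 - 0.67\left[-0.5(-1) - 0.25(-2) - 0.25(-2)\right] - 0.33\left[-0.25(-2) - 0.5(-1) - 0.25(-2)\right] \\
&= 1.553 - 0.67\,[0.5 + 0.5 + 0.5] - 0.33\,[0.5 + 0.5 + 0.5] \\
&= 1.553 - 1.005 - 0.495 \\
&= 0.053
\end{aligned}
\]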
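The arithmetic in this calculation can be checked with a short Python sketch. The function names `entropy` and `info_gain` are hypothetical, and the per-class document counts (4, 2, 2 for the subset of 8 documents and 1, 2, 1 for the subset of 4) are assumptions read off the fractions in the calculation. The parent class counts are likewise assumed to be the sum of the two subsets.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (base 2) of a list of per-class document counts."""
    total = sum(counts)
    # Classes with zero documents contribute nothing to the entropy.
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def info_gain(parent_counts, subsets):
    """Parent entropy minus the size-weighted entropy of the subsets."""
    total = sum(parent_counts)
    weighted = sum(sum(s) / total * entropy(s) for s in subsets)
    return entropy(parent_counts) - weighted

# Assumed class counts for the two subsets induced by the word "system".
s0 = [4, 2, 2]   # subset of 8 documents
s1 = [1, 2, 1]   # subset of 4 documents
parent = [a + b for a, b in zip(s0, s1)]  # assumed class counts over all 12 documents

gain = info_gain(parent, [s0, s1])
print(f"Entropy(S0) = {entropy(s0):.3f}")
print(f"Entropy(S1) = {entropy(s1):.3f}")
print(f"Infogain    = {gain:.3f}")
```

Both subset entropies come out as exactly 1.5, matching the bracketed sums in the calculation. Without intermediate rounding, the gain under these assumed counts is roughly 0.055 rather than 0.053; the small gap presumably stems from the rounded values (1.553, 0.67, 0.33) used in the hand calculation.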
Thus, the Infogain values of the two words “realtor” and “system” are
0.424 and 0.053, respectivel ...