
“4137X˙CH02˙Akerkar” — 2007/9/20 — 10:12 — page 60 — #42
60 CHAPTER 2 Information Retrieval
If we sort the documents by the values of the similarity function, we get the following order:
D0 (0.7867), D6 (0.4953), D2 (0.3361), D1 (0.2590), D5 (0.2215), D4 (0.1208), D3 (0.0969).
The new similarity scores are in parentheses. In this case, the document ranking remains
unchanged.
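The ordering above can be reproduced with a short sketch. The scores are the ones listed in the text; the dictionary name `scores` and the variable `ranking` are illustrative choices, not names from the chapter.

```python
# Similarity scores as listed in the text, keyed by document label.
scores = {
    "D0": 0.7867,
    "D1": 0.2590,
    "D2": 0.3361,
    "D3": 0.0969,
    "D4": 0.1208,
    "D5": 0.2215,
    "D6": 0.4953,
}

# Sort documents by descending similarity to obtain the ranking.
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for doc_id, score in ranking:
    print(f"{doc_id} ({score:.4f})")
```

Sorting on the score alone is enough here because no two documents share the same value; a collection with ties would need an explicit tie-breaking rule.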
Calculations of $idf_j$ with static document collections are a feasible programming exercise
(left to the readers as a project in the Exercise section). If the document collection is constantly
changing, it is not easy to obtain the values of $N$ and $n_j$. One of the possibilities ...
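For the static case, a minimal sketch of the $idf_j$ calculation might look as follows. It assumes the common definition $idf_j = \log(N / n_j)$, where $N$ is the number of documents and $n_j$ is the number of documents containing term $j$; the chapter's own formula may use a different logarithm base or smoothing. The function name `compute_idf` and the toy collection are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def compute_idf(documents):
    """Return {term: idf} for a static collection of tokenized documents.

    Assumes idf_j = log(N / n_j) with the natural logarithm; this is an
    assumption, not necessarily the exact formula used in the chapter.
    """
    n_docs = len(documents)  # N: total number of documents
    doc_freq = Counter()     # n_j: number of documents containing term j
    for tokens in documents:
        # Use a set so each term counts at most once per document.
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / n_j) for term, n_j in doc_freq.items()}


# Hypothetical toy collection: three already-tokenized documents.
toy_collection = [
    ["web", "search", "engine"],
    ["web", "mining"],
    ["web", "search", "ranking"],
]

idf = compute_idf(toy_collection)
for term in sorted(idf):
    print(term, round(idf[term], 4))
```

Under this definition a term that occurs in every document, such as "web" in the toy collection, gets an idf of zero, while rarer terms get larger values. The whole collection is scanned once, which is what makes the static case straightforward; whenever documents are added or removed, both $N$ and the $n_j$ counts change.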