Building an Intelligent Web: Theory and Practice

by Pawan Lingras, Rajendra Akerkar
March 2010
Content level: Intermediate to advanced
326 pages
12h 25m
English
Jones & Bartlett Learning
2.2 Document Representation
Readers can see the effects of running the Stemmer on Tokenized-d1.txt from the subdirectory
fig2.3 by typing the following command:
java -cp ../java Stemmer Tokenized-d1.txt
(Command 2.2)
The results of redirecting the output from (Command 2.2) appear in the file Stemmedd1.txt
in the subdirectory fig2.3.
2.2.2 Term-Document Matrix
A term-document matrix (TDM) is a two-dimensional representation of a document collection.
The rows of the matrix represent various documents, and the columns correspond to various
index terms. The values in the matrix can be either ...
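The layout described above (rows as documents, columns as index terms) can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name TermDocumentMatrix, the methods vocabulary and build, and the sample tokens are hypothetical and not from the book, and raw term counts are used as the cell values simply as one possible choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper class; not part of the book's code.
public class TermDocumentMatrix {

    // Collect the sorted, de-duplicated index terms (the column labels).
    public static List<String> vocabulary(List<List<String>> docs) {
        TreeSet<String> terms = new TreeSet<>();
        for (List<String> doc : docs) {
            terms.addAll(doc);
        }
        return new ArrayList<>(terms);
    }

    // Build the matrix: one row per document, one column per index term.
    // Assumption: each cell holds the raw count of the term in the document.
    public static int[][] build(List<List<String>> docs, List<String> terms) {
        int[][] tdm = new int[docs.size()][terms.size()];
        for (int d = 0; d < docs.size(); d++) {
            for (String token : docs.get(d)) {
                int t = terms.indexOf(token);
                if (t >= 0) {
                    tdm[d][t]++;
                }
            }
        }
        return tdm;
    }

    public static void main(String[] args) {
        // Two made-up documents, already tokenized and stemmed.
        List<List<String>> docs = List.of(
            List.of("web", "mine", "web"),
            List.of("data", "mine"));
        List<String> terms = vocabulary(docs);
        int[][] tdm = build(docs, terms);
        System.out.println(terms);
        for (int[] row : tdm) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Sorting the vocabulary keeps the column order stable across runs; a real collection would likely use a map from term to column index instead of the linear indexOf lookup.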

Publisher Resources

ISBN: 9780763741372