Vector space model implementation

With the background covered, we are ready to proceed to the code.

First, suppose that we have a text file in which each line is a document, and we want to index its content so that we can query it. For example, we can take some text from https://ocw.mit.edu/ans7870/6/6.006/s08/lecturenotes/files/t8.shakespeare.txt and save it to data/simple-text.txt.

Then we can read it this way:

Path path = Paths.get("data/simple-text.txt");
List<List<String>> documents = Files.lines(path, StandardCharsets.UTF_8)
    .map(line -> TextUtils.tokenize(line))
    .map(line -> TextUtils.removeStopwords(line))
    .collect(Collectors.toList());

We use the Files class from the standard library, and then use two functions: ...
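
The implementation of TextUtils is not shown in this excerpt, so the following is only a minimal sketch of what the two helper functions might look like. The class name TextUtilsSketch, the stop-word list, and the splitting rule (lowercase, then split on runs of non-word characters) are all assumptions made for illustration, not the book's actual code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for the TextUtils class used in the text.
class TextUtilsSketch {

    // Assumed, deliberately tiny stop-word list; a real one would be much longer.
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "the", "of", "to", "is", "in", "or", "that"));

    // Runs of non-word characters act as token separators.
    // Note: without UNICODE_CHARACTER_CLASS, \W also matches non-ASCII letters.
    private static final Pattern SEPARATOR = Pattern.compile("\\W+");

    // Lowercases the line and splits it into tokens, dropping empty strings.
    static List<String> tokenize(String line) {
        return Arrays.stream(SEPARATOR.split(line.toLowerCase(Locale.ROOT)))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Returns a new list containing only the tokens that are not stop words.
    static List<String> removeStopwords(List<String> tokens) {
        return tokens.stream()
                .filter(token -> !STOPWORDS.contains(token))
                .collect(Collectors.toList());
    }

    // Reads one document per line; try-with-resources closes the underlying file.
    static List<List<String>> readDocuments(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines
                    .map(TextUtilsSketch::tokenize)
                    .map(TextUtilsSketch::removeStopwords)
                    .collect(Collectors.toList());
        }
    }
}
```

With these assumptions, calling TextUtilsSketch.readDocuments(Paths.get("data/simple-text.txt")) would return one token list per line of the file. Wrapping Files.lines in try-with-resources is a design choice of this sketch: the stream holds an open file handle until it is closed.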
