Vector space model implementation

We now have enough background to proceed to the code.

First, suppose that we have a text file where each line is a document, and we want to index its content so that we can query it. For example, we can take some text from https://ocw.mit.edu/ans7870/6/6.006/s08/lecturenotes/files/t8.shakespeare.txt and save it to data/simple-text.txt.

Then we can read it this way:

Path path = Paths.get("data/simple-text.txt");
List<List<String>> documents = Files.lines(path, StandardCharsets.UTF_8)
    .map(line -> TextUtils.tokenize(line))
    .map(line -> TextUtils.removeStopwords(line))
    .collect(Collectors.toList());

We use the Files class from the standard library, and then use two functions: ...
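To make the snippet above easier to follow, here is a minimal sketch of what the two helpers could look like. The book's own TextUtils class is not shown in this excerpt, so everything below is an assumption made for illustration: the class body, the splitting rule (lowercase, then split on runs of non-letter characters), and the tiny stop-word list are hypothetical stand-ins and may differ from the actual implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the TextUtils helper used in the text;
// the real class may tokenize and filter differently.
final class TextUtils {

    // A deliberately tiny stop-word list, assumed for this sketch only.
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "are", "be", "in", "is", "it",
            "not", "of", "or", "the", "to"));

    private TextUtils() {
        // static helpers only
    }

    // Lowercases the line and splits it on runs of non-letter characters,
    // dropping any empty tokens produced by leading separators.
    static List<String> tokenize(String line) {
        return Arrays.stream(line.toLowerCase(Locale.ROOT).split("[^\\p{L}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Keeps only the tokens that are not in the stop-word list.
    static List<String> removeStopwords(List<String> tokens) {
        return tokens.stream()
                .filter(token -> !STOPWORDS.contains(token))
                .collect(Collectors.toList());
    }
}
```

With this sketch, tokenizing the line "The quality of mercy is not strained" gives the tokens the, quality, of, mercy, is, not, strained, and removing stop words leaves quality, mercy, strained. A real stop-word list would be much longer, and splitting on non-letters also breaks contractions and possessives apart, which may or may not be desirable for a given corpus.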
