As mentioned earlier, this class implements the tasks that calculate the final keyword list. It implements the Runnable interface, so each task can be executed in its own Thread, and it internally uses several attributes, most of which are shared among all the tasks:
- Two ConcurrentHashMap objects to store the global vocabulary and the global keywords: We use ConcurrentHashMap because all the tasks update these objects, so we need a concurrent data structure to avoid race conditions.
- Two ConcurrentLinkedDeque objects of File elements, to store the list of files that form the document collection: We use the ConcurrentLinkedDeque class because all the tasks are going to extract (get and ...
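
As a rough illustration of the shared attributes listed above, the following minimal sketch shows one way such a task could be declared. The class name KeywordTaskSketch, the field names, the Integer value type of the maps, and the demo method are all assumptions made for this example and are not taken from the original class. The "parsing" step is only a stand-in that counts file names, and only the first deque and the vocabulary map are exercised.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical sketch: every name here is an assumption, not the original code.
class KeywordTaskSketch implements Runnable {

    // Shared by all tasks and updated concurrently, hence ConcurrentHashMap.
    private final ConcurrentHashMap<String, Integer> globalVocabulary;
    private final ConcurrentHashMap<String, Integer> globalKeywords; // unused in this sketch

    // Shared lists of files; each task removes the files it processes.
    private final ConcurrentLinkedDeque<File> firstFileList;
    private final ConcurrentLinkedDeque<File> secondFileList; // unused in this sketch

    KeywordTaskSketch(ConcurrentHashMap<String, Integer> globalVocabulary,
                      ConcurrentHashMap<String, Integer> globalKeywords,
                      ConcurrentLinkedDeque<File> firstFileList,
                      ConcurrentLinkedDeque<File> secondFileList) {
        this.globalVocabulary = globalVocabulary;
        this.globalKeywords = globalKeywords;
        this.firstFileList = firstFileList;
        this.secondFileList = secondFileList;
    }

    @Override
    public void run() {
        File file;
        // poll() atomically removes the head (or returns null when empty),
        // so the same File element is never handed to two tasks.
        while ((file = firstFileList.poll()) != null) {
            // Stand-in for real parsing: count the file name as one "word".
            // merge() performs the read-modify-write atomically.
            globalVocabulary.merge(file.getName(), 1, Integer::sum);
        }
    }

    // Runs numTasks threads over numFiles File elements that reuse ten names.
    static ConcurrentHashMap<String, Integer> demo(int numTasks, int numFiles) {
        ConcurrentHashMap<String, Integer> vocabulary = new ConcurrentHashMap<>();
        ConcurrentHashMap<String, Integer> keywords = new ConcurrentHashMap<>();
        ConcurrentLinkedDeque<File> first = new ConcurrentLinkedDeque<>();
        ConcurrentLinkedDeque<File> second = new ConcurrentLinkedDeque<>();
        for (int i = 0; i < numFiles; i++) {
            first.add(new File("doc" + (i % 10) + ".txt")); // files need not exist
        }
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            Thread t = new Thread(new KeywordTaskSketch(vocabulary, keywords, first, second));
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // wait for every task before reading the result
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return vocabulary;
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000));
    }
}
```

With a plain HashMap or ArrayDeque in place of the concurrent classes, the same demo could lose updates or process an element twice, which is the kind of race condition the concurrent data structures are there to prevent.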