The KeywordExtractionTask class

As mentioned earlier, this class implements the tasks that calculate the final keyword list. It implements the Runnable interface, so each task can be executed in a Thread, and it uses several internal attributes, most of which are shared among all the tasks:

  • Two ConcurrentHashMap objects to store the global vocabulary and the global keywords: We use ConcurrentHashMap because all the tasks update these objects, so we need a concurrent data structure to avoid race conditions.
  • Two ConcurrentLinkedDeque of File objects, to store the list of files that forms the document collection: We use the ConcurrentLinkedDeque class because all the tasks are going to extract (get and ...
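
The shared-state pattern described in the list above can be sketched as follows. This is a minimal illustration, not the actual KeywordExtractionTask: the class names SharedStateSketch and CountingTask, the runSketch helper, and the idea of counting file names in place of real keyword extraction are all assumptions made for the example. It only shows several Runnable tasks taking File objects from one ConcurrentLinkedDeque and updating one ConcurrentHashMap without explicit locking.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedDeque;

public class SharedStateSketch {

    // Hypothetical task: stands in for the real keyword extraction logic.
    static class CountingTask implements Runnable {
        private final ConcurrentLinkedDeque<File> files;         // shared work queue
        private final ConcurrentHashMap<String, Integer> counts; // shared result map

        CountingTask(ConcurrentLinkedDeque<File> files,
                     ConcurrentHashMap<String, Integer> counts) {
            this.files = files;
            this.counts = counts;
        }

        @Override
        public void run() {
            File file;
            // poll() removes and returns the head atomically, or null when the
            // deque is empty, so no file is handed to two tasks.
            while ((file = files.poll()) != null) {
                // merge() applies the read-modify-write atomically per key.
                counts.merge(file.getName(), 1, Integer::sum);
            }
        }
    }

    // Hypothetical helper: runs numThreads tasks over the given file names.
    static ConcurrentHashMap<String, Integer> runSketch(List<String> names, int numThreads) {
        ConcurrentLinkedDeque<File> files = new ConcurrentLinkedDeque<>();
        for (String name : names) {
            files.add(new File(name)); // the files do not need to exist on disk
        }
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < numThreads; i++) {
            Thread thread = new Thread(new CountingTask(files, counts));
            threads.add(thread);
            thread.start();
        }
        try {
            for (Thread thread : threads) {
                thread.join(); // wait until every task has drained the deque
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> counts =
                runSketch(Arrays.asList("a.txt", "b.txt", "a.txt"), 4);
        System.out.println("a.txt seen " + counts.get("a.txt") + " time(s)");
        System.out.println("b.txt seen " + counts.get("b.txt") + " time(s)");
    }
}
```

With a plain HashMap or a non-concurrent list, the same code could lose updates or process a file twice when several threads run at once. The concurrent collections make the individual poll() and merge() calls safe to use from every task.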

