As we mentioned earlier, we have implemented the assignment and update phases as tasks that run in the fork/join framework.
The assignment phase assigns each document to the cluster that has the lowest Euclidean distance from it, so we have to process all the documents and calculate the Euclidean distance between every document and every cluster. We use the number of documents a task has to process to decide whether to split the task. We start with tasks that have to process all the documents, and we split them until each task has to process a number of documents lower than a predefined size.
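The splitting strategy described above can be sketched as follows. This is only a minimal illustration, not the implementation presented in this chapter: the class name AssignmentSketch, the MAX_SIZE threshold, and the representation of documents and clusters as plain double[] vectors are all assumptions made for the example. It compares squared distances, which gives the same nearest cluster as the Euclidean distance without computing a square root.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical sketch: documents and cluster centers are plain double[] vectors.
class AssignmentSketch extends RecursiveAction {
    // Assumed threshold: ranges with at most this many documents are processed directly.
    private static final int MAX_SIZE = 2;

    private final double[][] documents;
    private final double[][] centroids;
    private final int[] assignment; // assignment[i] = index of the cluster chosen for document i
    private final int start;        // first document of this task (inclusive)
    private final int end;          // last document of this task (exclusive)

    AssignmentSketch(double[][] documents, double[][] centroids,
                     int[] assignment, int start, int end) {
        this.documents = documents;
        this.centroids = centroids;
        this.assignment = assignment;
        this.start = start;
        this.end = end;
    }

    @Override
    protected void compute() {
        if (end - start <= MAX_SIZE) {
            // Small enough: assign each document in the range to its nearest cluster.
            for (int i = start; i < end; i++) {
                assignment[i] = nearest(documents[i], centroids);
            }
        } else {
            // Too many documents: split the range in two and run both halves.
            int mid = (start + end) >>> 1;
            invokeAll(
                new AssignmentSketch(documents, centroids, assignment, start, mid),
                new AssignmentSketch(documents, centroids, assignment, mid, end));
        }
    }

    // Returns the index of the cluster with the lowest distance to the document.
    static int nearest(double[] document, double[][] centroids) {
        int best = -1;
        double bestDistance = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double distance = squaredDistance(document, centroids[c]);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return best;
    }

    // Squared Euclidean distance; it preserves the ordering of the true distance.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Convenience entry point: runs one task over all the documents in the common pool.
    static int[] assign(double[][] documents, double[][] centroids) {
        int[] assignment = new int[documents.length];
        ForkJoinPool.commonPool().invoke(
            new AssignmentSketch(documents, centroids, assignment, 0, documents.length));
        return assignment;
    }
}
```

Calling AssignmentSketch.assign(documents, centroids) returns one cluster index per document. Each task writes only to its own slice of the shared assignment array, so the subtasks need no synchronization among themselves.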
The AssignmentTask ...