July 2018
Beginner to intermediate
406 pages
9h 55m
English
Before we can train a classifier to distinguish between good and bad answers, we have to create the training data. So far, we have only raw, unlabeled data; we still need to define the labels.
Of course, we could simply take the best- and worst-scoring answers per question as positive and negative examples. However, what do we do with questions that have only good answers, say, one with two points and the other with four? Should we really take the answer with two points as a negative example just because it happened to be the one with the lower score? Or let's say that a question has only two negative answers, one with a score of -2 and the other with -4. Clearly, we cannot take the answer with -2 as a positive example.
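To make the problem concrete, here is a minimal sketch of that naive scheme applied to the two cases just described. The question IDs, the `answers_by_question` dictionary, and the `naive_labels` helper are hypothetical names made up for illustration; they are not part of the dataset or of any library.

```python
# Hypothetical toy data: question id -> scores of its answers.
answers_by_question = {
    "q_only_good": [2, 4],    # both answers have positive scores
    "q_only_bad": [-2, -4],   # both answers have negative scores
}


def naive_labels(scores):
    """Label the highest-scoring answer 1 and the lowest-scoring answer 0.

    Returns a list of (score, label) pairs. This is the naive scheme
    discussed in the text, not a recommended labeling strategy.
    """
    return [(max(scores), 1), (min(scores), 0)]


labels = {qid: naive_labels(scores)
          for qid, scores in answers_by_question.items()}

for qid, pairs in labels.items():
    print(qid, pairs)
```

In this sketch, the answer with two points ends up labeled as negative and the answer with -2 ends up labeled as positive, which are exactly the two mislabelings described above: the labels depend only on the rank within a question, not on whether the score itself is good or bad.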
We will ...