February 2018
Intermediate to advanced
378 pages
10h 14m
English
Usually, if you have to label a big dataset manually, you split it into manageable batches so that several people can work on different portions in parallel. The problem is that each of those people labels a little differently and therefore introduces their own kind and amount of variability into their batch. This is especially the case when subjective opinions are involved, such as "Is this movie review slightly positive or rather neutral?"
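One rough way to check for this kind of annotator-driven variability is to compare how often each person assigned each label. The following is a minimal sketch, not a definitive method: the function name `label_shares_by_annotator`, the annotator IDs, and the toy records are all hypothetical, and the comparison is only meaningful if items were assigned to batches more or less at random.

```python
from collections import Counter, defaultdict


def label_shares_by_annotator(records):
    """Return {annotator: {label: share}} for (annotator, label) pairs."""
    counts = defaultdict(Counter)
    for annotator, label in records:
        counts[annotator][label] += 1

    shares = {}
    for annotator, counter in counts.items():
        total = sum(counter.values())
        shares[annotator] = {label: n / total for label, n in counter.items()}
    return shares


# Hypothetical toy data: two annotators, four movie reviews each.
records = [
    ("ann_a", "positive"), ("ann_a", "positive"),
    ("ann_a", "neutral"), ("ann_a", "positive"),
    ("ann_b", "neutral"), ("ann_b", "neutral"),
    ("ann_b", "positive"), ("ann_b", "neutral"),
]

shares = label_shares_by_annotator(records)
for annotator in sorted(shares):
    print(annotator, shares[annotator])
```

If the batches were drawn randomly from the same pool, large differences between the per-annotator shares hint that the labelers interpret the categories differently rather than that the reviews themselves differ.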
Batch effects are also a common problem in datasets that were compiled from several different sources. In many cases, they become apparent when you plot the data obtained from each source separately.
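As a numeric complement to plotting each source separately, you can summarize the same measurement per source and compare the results. The sketch below uses only the Python standard library; the function name `summarize_by_source`, the source names, and the values are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_source(rows):
    """Group (source, value) pairs and return count, mean, and stdev per source."""
    groups = defaultdict(list)
    for source, value in rows:
        groups[source].append(value)

    return {
        source: {
            "n": len(values),
            "mean": mean(values),
            # Sample standard deviation needs at least two values.
            "stdev": stdev(values) if len(values) > 1 else 0.0,
        }
        for source, values in groups.items()
    }


# Hypothetical measurements of the same quantity from two sources.
rows = [
    ("lab_1", 10.0), ("lab_1", 12.0), ("lab_1", 11.0),
    ("lab_2", 15.0), ("lab_2", 17.0), ("lab_2", 16.0),
]

summary = summarize_by_source(rows)
for source in sorted(summary):
    print(source, summary[source])
```

A clear shift in the mean or spread between sources that should be measuring the same thing is a sign of a possible batch effect, which a per-source plot would then help confirm.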