In several experiments reported in the paper Revisiting Unreasonable Effectiveness of Data in Deep Learning Era, Google researchers constructed an internal dataset (JFT-300M) containing 300 million labeled images, far larger than ImageNet. They then trained several state-of-the-art architectures on this dataset, increasing the amount of data shown to the model from 10 million to 30 million, 100 million, and finally 300 million observations. In doing so, they showed that model performance increased linearly with the logarithm of the number of training observations, suggesting that more data consistently helps in the source domain.
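This log-linear relationship can be sketched with a quick curve fit. The scores below are illustrative placeholders, not the paper's actual results; the point is only the shape of the fit, performance as a linear function of log(dataset size):

```python
import numpy as np

# Hypothetical performance scores at each dataset size
# (illustrative numbers only, not the paper's reported results).
sizes = np.array([10e6, 30e6, 100e6, 300e6])
scores = np.array([31.0, 33.2, 35.5, 37.8])

# Fit: score ~= slope * log10(n) + intercept
slope, intercept = np.polyfit(np.log10(sizes), scores, 1)
print(f"gain per 10x more data: {slope:.2f}")

# Extrapolate one more decade of data (300M -> 3B observations)
predicted = slope * np.log10(3e9) + intercept
print(f"predicted score at 3B: {predicted:.1f}")
```

Under a log-linear trend, each constant gain in performance costs roughly ten times more data, which is why the curve keeps rising but the returns per additional observation shrink.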
But what about the target domain? We repeated the Google experiment using a few ...