Less than a decade ago, neural networks were not the best at image processing. The reason, other than data and CPU power, is that researchers were using dense layers. When stacking several layers and dense layers connecting thousands of pixels to, say, a thousand hidden units, we ended up with a non-convex cost function to optimize that had millions of parameters.
The curse of dimensionality was thus very much an issue, even the biggest databases may not have been enough. But let's go back to the introduction. Machine learning is not just training a model, it's also about feature processing. In image processing, people used lots of different tools to extract features from an image, but one common tool for all ...