When I tried to develop this application, I found that the photos come in different shapes and sizes: some are tall and some are wide, some were taken outdoors and some indoors, and most of them show food. Most images are roughly square, but their pixel dimensions vary, and many of them are exactly 500 x 375.
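One way to check this kind of observation is to tally the image sizes directly. The following is a minimal sketch, not the code used for this application: the helper names `orientation` and `summarize_sizes` and the sample sizes are hypothetical, chosen only for illustration. It assumes each image's size is already available as a `(width, height)` pair.

```python
from collections import Counter


def orientation(width, height):
    """Classify an image as 'wide', 'tall', or 'square' from its pixel size."""
    if width > height:
        return "wide"
    if height > width:
        return "tall"
    return "square"


def summarize_sizes(sizes):
    """Tally orientations and exact (width, height) pairs for a list of sizes."""
    sizes = list(sizes)
    by_orientation = Counter(orientation(w, h) for w, h in sizes)
    by_dimension = Counter(sizes)
    return by_orientation, by_dimension


# Hypothetical sample, not real dataset statistics. In practice each pair
# could come from an image library, for example Pillow's
# Image.open(path).size, which returns (width, height).
sample_sizes = [(500, 375), (500, 375), (375, 500), (500, 500), (500, 373)]

by_orientation, by_dimension = summarize_sizes(sample_sizes)
print(by_orientation.most_common())   # orientations, most frequent first
print(by_dimension.most_common(1))    # the single most frequent exact size
```

Running the same tally over the real photo directory would show how dominant a single size such as 500 x 375 actually is, which helps when choosing a common target size for preprocessing.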
We have already seen that a CNN cannot work with images of heterogeneous shapes and sizes. There are many robust and efficient image processing techniques to extract only the region of interest (ROI ...