Now that we have detected the regions, we must crop the text before submitting it to the OCR. We could simply use a function such as getRectSubPix or Mat::copyTo, taking each region's rectangle as a region of interest (ROI), but since the letters are skewed, some undesired neighbouring text may be cropped along with them. For example, this is what one of the regions would look like if we just extracted the ROI based on its given rectangle:
Fortunately, ERFilter provides us with an object called ERStat, which describes each extremal region and includes a seed pixel inside it. Starting from these pixels, we can use OpenCV's floodFill function to reconstruct each letter. This function fills the connected component of similarly colored pixels around a given seed point, much like the bucket tool in a drawing application.