Classification is most accurate when the system is trained on blank forms (forms that have not been filled in). If blank forms are not available, accurate classification can still be achieved in one of two ways.
The first option is to redact the samples you have, removing sensitive and instance-specific data, before uploading them to Ephesoft for training.
The second option is to edit the HOCR file that is created after you click Learn Files in the Batch Class Management administrative interface. The HOCR file is the XML representation of the OCR output.
The XML file can be edited to remove any content that is not part of the blank form. After the XML file is updated, click on Learn Files again to update the index ...
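The XML edit described above can also be scripted rather than done by hand. The following is a minimal sketch only, using Python's standard xml.etree.ElementTree module. It assumes that each recognized word is stored in a Span element containing a Value child; those element names, the sample XML, and the function name strip_filled_values are illustrative assumptions and should be checked against the HOCR file your installation actually generates.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_filled_values(xml_text, filled_values):
    """Remove every Span whose Value text is in filled_values.

    The Span/Value layout is an assumption about the HOCR XML;
    adjust the names to match the real file.
    Returns the cleaned XML string and the number of spans removed.
    """
    root = ET.fromstring(xml_text)
    removed = 0
    # Snapshot the tree first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) != "Span":
                continue
            value = next(
                (c for c in child if local_name(c.tag) == "Value"), None
            )
            if value is not None and (value.text or "").strip() in filled_values:
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed


# Hypothetical sample: two form labels plus two filled-in values.
SAMPLE = """<HocrPages><HocrPage><Spans>
<Span><Value>Name:</Value></Span>
<Span><Value>John</Value></Span>
<Span><Value>Date:</Value></Span>
<Span><Value>01/02/2020</Value></Span>
</Spans></HocrPage></HocrPages>"""

cleaned, removed = strip_filled_values(SAMPLE, {"John", "01/02/2020"})
print(removed)
print(cleaned)
```

Run a script like this on a backup copy of the HOCR file and review the result before clicking Learn Files again. If the real file declares an XML namespace, ElementTree may rewrite the namespace prefixes when it serializes the output, so compare the cleaned file against the original.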