5

Text Augmentation

Text augmentation is a technique that is used in Natural Language Processing (NLP) to generate additional data by modifying or creating new text from existing text data. Text augmentation involves techniques such as character swapping, noise injection, synonym replacement, word deletion, word insertion, and word swapping. Image and text augmentation have the same goal. They strive to increase the size of the training dataset and improve AI prediction accuracy.

Text augmentation is relatively more challenging to evaluate because it is not as intuitive as image augmentation. The intent of an image augmentation technique is clear, such as flipping a photo, but a character-swapping technique will be disorienting to the reader. ...

Get Data Augmentation with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.