15 DATA AUGMENTATION FOR TEXT
How is data augmentation useful, and what are the most common augmentation techniques for text data?
Data augmentation artificially increases the size of a dataset, which can improve model performance, for example by reducing overfitting, as discussed in Chapter 5. In computer vision, common augmentation techniques include rotating, scaling, and flipping images.
Similarly, there are several techniques for augmenting text data. The most common include synonym replacement, word deletion, word position swapping, sentence shuffling, noise injection, back translation, and text generated by LLMs. This chapter ...
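To make a few of the listed techniques concrete, the following is a minimal sketch of synonym replacement, word deletion, word position swapping, and sentence shuffling. It is an illustration under stated assumptions, not the book's implementation: the function names, the `p` and `n_swaps` parameters, and the toy `SYNONYMS` table are all hypothetical, and a real pipeline would more likely draw synonyms from a thesaurus such as WordNet.

```python
import random

# Hypothetical toy synonym table, for illustration only.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}


def synonym_replacement(words, rng, synonyms=SYNONYMS):
    """Replace every word that has a table entry with a randomly chosen synonym."""
    return [rng.choice(synonyms[w]) if w in synonyms else w for w in words]


def word_deletion(words, rng, p=0.2):
    """Drop each word independently with probability p, keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def word_swap(words, rng, n_swaps=1):
    """Swap two randomly chosen word positions, repeated n_swaps times."""
    words = list(words)  # copy so the caller's list is not modified
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)  # two distinct indices
        words[i], words[j] = words[j], words[i]
    return words


def sentence_shuffle(sentences, rng):
    """Return the sentences in a random order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    words = "the quick dog is happy".split()
    print(synonym_replacement(words, rng))
    print(word_deletion(words, rng, p=0.3))
    print(word_swap(words, rng))
    print(sentence_shuffle(["First sentence.", "Second one.", "Third one."], rng))
```

Passing an explicit `random.Random` instance keeps the augmentations reproducible, which helps when comparing models trained on augmented data. Note that deletion and swapping can change the meaning of a sentence, so the probabilities are usually kept small.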