10 Best practices in developing NLP applications
This chapter covers
- Making neural network inference more efficient by sorting, padding, and masking tokens
- Applying character-based and BPE tokenization for splitting text into tokens
- Avoiding overfitting via regularization
- Dealing with imbalanced datasets by using upsampling, downsampling, and loss weighting
- Optimizing hyperparameters
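As a preview of the first item, here is a minimal sketch (pure Python, with an assumed padding ID of 0 and a hypothetical `pad_and_mask` helper) of how sorting, padding, and masking fit together when batching variable-length token sequences:

```python
PAD_ID = 0  # assumed padding token ID for this illustration

def pad_and_mask(batch):
    """Sort a batch of token-ID sequences by length, pad them to a
    uniform length, and build a mask marking the real (non-pad) tokens."""
    # Sorting longest-first groups similar-length sequences together,
    # which reduces wasted computation on padding tokens
    batch = sorted(batch, key=len, reverse=True)
    max_len = len(batch[0])
    padded = [seq + [PAD_ID] * (max_len - len(seq)) for seq in batch]
    # mask: 1 for real tokens, 0 for padding, so losses and attention
    # can ignore the padded positions
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return padded, mask

padded, mask = pad_and_mask([[5, 3], [7, 2, 9, 4], [8]])
```

In practice a framework utility (e.g. PyTorch's `torch.nn.utils.rnn.pad_sequence`) does the padding, but the idea is the same: every sequence in a batch is brought to a common length, and the mask tells the model which positions to ignore.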
We’ve covered a lot of ground so far, including deep neural network models such as RNNs, CNNs, and the Transformer, and modern NLP frameworks such as AllenNLP and Hugging Face Transformers. However, we’ve paid little attention to the details of training and inference. For example, how do you train and make predictions efficiently? How do you avoid having your model ...