Chapter 9. Tools of the Trade

In the preceding section, we covered all the foundational elements of NLP and how to develop NLP models. Starting with this chapter, we’ll cover what you should begin to think about as you come out of the wonderful world of training magnificent models on carefully curated datasets and into the mess that is the real world.

In this chapter specifically, we will discuss mainstream machine learning software and the choices you will face as you decide what to include in your stack. Then, in Chapter 10, we will build custom web apps for machine learning and data science using Streamlit, an easy-to-use open source Python library. We will conclude this section in Chapter 11 with model deployment at scale using software from the industry leader, Databricks. By the end of these three chapters, you will have a good understanding of how to productionize machine learning models as web apps, APIs, and machine learning pipelines.

Let’s start with a topic many developers love spending inordinate amounts of time arguing over: tools.

People who should probably be spending their time coding love hashing out the standard TensorFlow-versus-PyTorch or best-programming-language debates on endlessly long Twitter threads. We want to take a step back and talk about some of the more practical decisions you’ll have to wrestle with in the real world. After all, “applied” is in the title of this book.

Here are a few obligatory disclaimers:

  • It is almost certain that ...
