Feature Engineering for Natural Language Data

In the previous chapter, we explored how to extract features from numerical data and images, along with a few algorithms used for that purpose. In this chapter, we’ll continue with algorithms that extract features from natural language data.

Natural language is a special kind of data source in software engineering. With the introduction of GitHub Copilot and ChatGPT, it became evident that machine learning and artificial intelligence tools for software engineering tasks are no longer science fiction. In this chapter, we’ll therefore explore the first steps that made these technologies so powerful: feature extraction from natural language data.

In this chapter, we’ll cover the

