NLTK应用开发指南

by Posts & Telecom Press, Nitin K Hardeniya
May 2024
Intermediate to advanced
172 pages
2h 39m
Chinese
Packt Publishing

Chapter 5: NLP Applications

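This chapter looks at NLP applications in concrete terms. We will draw on the concepts from all the previous chapters and see what kinds of applications can be built with them, so this is a thoroughly hands-on chapter. Earlier chapters covered most of the preprocessing steps that every NLP application needs: how to use tokenizers, POS tagging, and NER, and how to parse text. This chapter aims to show how that knowledge can be applied to build more complex NLP applications.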

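Many NLP applications already exist in the real world: Google Search, Siri, machine translation, Google News, Jeopardy[1], and spell checking are all familiar examples. Some of these technologies are the result of many years of research by practitioners who have brought them to their current level. NLP is very complex; as noted in earlier chapters, even preprocessing steps such as POS tagging and NER are largely still research problems. With the NLTK library, however, we have already solved many of these problems to a reasonable degree of accuracy. This book does not cover more complex applications such as machine translation and speech recognition. You should now have enough background, though, to look at some basic applications in the field. As NLP enthusiasts, we should have a basic understanding of these applications. Readers are encouraged to find some NLP applications on the internet and try to understand how they work.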

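In summary, this chapter covers the following.

  • An introduction to several common NLP applications.
  • Building an NLP application (a news summarizer) with what we have learned so far.
  • The different emphases of various NLP applications, and the basic details of each.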

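Let us start with a very complex NLP application: summarization. The concept is simple: given an article, a passage, or a story, we usually want to generate a summary of its content automatically. In practice, summarization calls for fairly deep NLP knowledge, because it requires understanding not just the structure of individual sentences but the structure of the whole text, as well as its genre and subject matter. ...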
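
To make the idea of summarization concrete, here is a minimal sketch of one simple extractive approach: score each sentence by the average frequency of its non-stopword tokens and keep the top-scoring sentences in their original order. This is not the summarizer the chapter goes on to build; it is an illustration under stated assumptions. The regex-based sentence splitter and tokenizer, the tiny `STOPWORDS` set, and the function names are all hypothetical stand-ins chosen so the example runs with the standard library alone. In an NLTK-based version you would likely use `nltk.sent_tokenize`, `nltk.word_tokenize`, and NLTK's stopword corpus instead.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption for this sketch;
# NLTK ships a much fuller one in its stopwords corpus).
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "of", "to", "in", "on",
    "and", "or", "it", "this", "that", "for", "with", "as", "by", "at", "be",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and extract alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def content_words(sentence):
    """Tokens of the sentence with stopwords removed."""
    return [w for w in tokenize(sentence) if w not in STOPWORDS]


def summarize(text, n=2):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)

    # Word frequencies over the whole text, ignoring stopwords.
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        if not words:
            return 0.0
        # Average frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words)

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    return [sentences[i] for i in sorted(ranked[:n])]


if __name__ == "__main__":
    sample = (
        "NLTK is a Python library for NLP. "
        "NLTK provides tokenizers and taggers for NLP. "
        "The weather was nice yesterday. "
        "Summarization uses NLP to shorten text."
    )
    for line in summarize(sample, n=2):
        print(line)
```

A frequency score like this only captures which words recur. It knows nothing about the structure, genre, or subject matter of the text, which is exactly why the passage above describes summarization as requiring deeper NLP knowledge.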


Publisher Resources

ISBN: 9781836205913