Python文本分析

by Jens Albrecht, Sidharth Ramachandran, Christian Winkler
August 2022
Intermediate to advanced
441 pages
11h 26m
Chinese
China Electric Power Press Ltd.
Sentiment Analysis of Text Data
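... In situations like these, we may not have much data to train a model, yet high accuracy is essential. We know that sentiment depends on the context in which a word appears, and that using a pre-trained model can improve sentiment prediction. This lets us move beyond the limits of our own dataset and draw on knowledge learned from common language usage.
In this case study we will use the Transformers library, which offers easy-to-use functionality and broad support for many pre-trained models. The sidebar below discusses this choice. The Transformers library is updated continually, thanks to contributions from many researchers.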
Choosing Transformers
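Many research groups have released excellent deep learning models, but for now each group goes its own way, and the models are not compatible across frameworks. For example, the BERT model was developed mainly by a Google research team on TensorFlow and could not run on PyTorch, so anyone who wanted to use PyTorch had to port or rewrite all of the code. These deep learning models also lack a standard input format and shared naming conventions, which makes a standardized approach hard to achieve. To address this, we can turn to the Transformers library from Hugging Face (https://oreil.ly/F0Vy7). Using this library has two main advantages: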
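• With the Transformers library, choosing a different pre-trained model only requires changing the value of a variable. The library implements and provides most of the benchmark models developed in the last two years, such as BERT and GPT-2.
• The library runs on both PyTorch and TensorFlow. These were the two most popular deep learning frameworks as of 2020; if you use Transformers ...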
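The first advantage above, switching models by changing a single variable, might look like the following minimal sketch. It is not the book's own code: the checkpoint name, the helper names `load_classifier` and `encode`, and the default of two labels are assumptions made for illustration. It relies on the library's `AutoTokenizer` and `AutoModelForSequenceClassification` classes, which pick the matching architecture from the checkpoint name.

```python
"""Sketch: switch pre-trained models by changing one variable."""

# Assumption: a checkpoint name from the Hugging Face model hub, chosen for
# illustration; another checkpoint such as "distilbert-base-uncased" could be
# substituted here without touching the rest of the code.
MODEL_NAME = "bert-base-uncased"


def load_classifier(model_name: str = MODEL_NAME, num_labels: int = 2):
    """Load the tokenizer and sequence-classification model for a checkpoint.

    The import is deferred so this module can be imported without the
    library installed. The first call downloads the checkpoint files.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=num_labels
    )
    return tokenizer, model


def encode(tokenizer, texts, framework: str = "pt"):
    """Tokenize a list of texts into padded, truncated tensors.

    `framework` is passed through as `return_tensors`: "pt" requests PyTorch
    tensors, "tf" requests TensorFlow tensors (only where the installed
    library version still supports TensorFlow).
    """
    return tokenizer(
        texts, padding=True, truncation=True, return_tensors=framework
    )
```

Calling `tokenizer, model = load_classifier()` and then `encode(tokenizer, ["great product"])` would produce inputs for the model. A base checkpoint loaded this way gets a freshly initialized classification head, so it would still need fine-tuning on labeled sentiment data before its predictions mean anything.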

Publisher Resources

ISBN: 9787519864446