Web机器学习 (Machine Learning for the Web)

by Posts & Telecom Press, Andrea Isoni
May 2024
Intermediate to advanced
234 pages
3h 58m
Chinese
Packt Publishing

Chapter 8 Movie Review Sentiment Analysis Application

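In this chapter, we use the algorithms and methods introduced in the previous chapters to build a sentiment analysis system that determines the sentiment polarity of movie reviews. We use the Scrapy library to collect reviews from different websites through a search engine API (the Bing search engine). The review text and the movie title are extracted either with the newspaper library or with predefined extraction rules for the HTML page. The sentiment of each review is determined by a naive Bayes classifier that uses the most informative words (selected with the χ² test) as features, a method covered in Chapter 4. The rank of each page related to a movie query is calculated with the PageRank algorithm, also covered in Chapter 4.

This chapter discusses the code of the application, including the Django models and views and the Scrapy scraper that collects the review data. We begin with an example of the web application and explain the search engine API we use and how it is integrated into the application. We then describe how the movie reviews are collected: integrating Scrapy into Django, writing the models that store the data, and the main commands used to manage the application. All the code discussed in this chapter is available in the chapter_8 folder of the author's GitHub repository at https://github.com/ai2010/machine_learning_for_the_web/tree/master/chapter_8.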
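To make the classification step concrete, the following is a minimal sketch of χ² feature selection followed by a multinomial naive Bayes classifier. It is not the code from the book's repository: it uses only the Python standard library, the function names (`chi2_scores`, `select_features`, `train_nb`, `predict`) are hypothetical, and the four toy reviews are invented for illustration.

```python
import math
from collections import Counter


def chi2_scores(docs, labels):
    """Chi-squared statistic of each word against a binary label (1 = positive)."""
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    pos_df, neg_df = Counter(), Counter()
    for words, label in zip(docs, labels):
        # Count document frequency, so each word counts once per review.
        (pos_df if label == 1 else neg_df).update(set(words))
    scores = {}
    for w in set(pos_df) | set(neg_df):
        a, b = pos_df[w], neg_df[w]      # positive / negative reviews containing w
        c, d = n_pos - a, n_neg - b      # positive / negative reviews without w
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[w] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_features(docs, labels, k):
    """Keep the k words with the highest chi-squared score (ties broken alphabetically)."""
    scores = chi2_scores(docs, labels)
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return set(ranked[:k])


def train_nb(docs, labels, features):
    """Multinomial naive Bayes with add-one smoothing, restricted to `features`."""
    counts = {0: Counter(), 1: Counter()}
    class_docs = Counter(labels)
    for words, label in zip(docs, labels):
        counts[label].update(w for w in words if w in features)
    model = {}
    for c in (0, 1):
        total = sum(counts[c].values())
        model[c] = {
            "prior": math.log(class_docs[c] / len(docs)),
            "loglik": {
                w: math.log((counts[c][w] + 1) / (total + len(features)))
                for w in features
            },
        }
    return model


def predict(model, words):
    """Return the class (0 or 1) with the highest log posterior."""
    def score(c):
        loglik = model[c]["loglik"]
        return model[c]["prior"] + sum(loglik[w] for w in words if w in loglik)
    return max((0, 1), key=score)


# Toy training data (invented for this sketch): 1 = positive, 0 = negative.
docs = [
    "a great and moving film".split(),
    "great cast and great story".split(),
    "a boring and awful film".split(),
    "awful plot and boring cast".split(),
]
labels = [1, 1, 0, 0]

features = select_features(docs, labels, k=3)
model = train_nb(docs, labels, features)
print(sorted(features))
print(predict(model, "a great story".split()))
```

Words that occur equally often in both classes, such as "and" or "film", receive a χ² score of zero and are discarded, so the classifier only sees words whose presence actually separates positive from negative reviews.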

Figure 8.1 shows the home page of the application:

[Figure 8.1: home page of the web application]

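To find out the sentiment and relevance of a movie's reviews, the user simply enters the movie title as a query. For example, Figure 8.2 shows the sentiment analysis results for the reviews of the movie Batman vs Superman Dawn of Justice[1]: ...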


Publisher Resources

ISBN: 9781836203612