Practical Data Science Cookbook (数据科学实战手册)

by Posts & Telecom Press, Tony Ojeda, Sean Patrick Murphy, Bengfort Benjamin
May 2024
Intermediate to advanced
357 pages
5h 3m
Chinese
Packt Publishing

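Chapter 10: Harvesting and Geolocating Twitter Data (Python)

This chapter covers the following topics:

  • Creating a Twitter application
  • Understanding the Twitter API v1.1
  • Retrieving follower and friend information
  • Retrieving Twitter user profiles
  • Avoiding Twitter rate limits
  • Storing JSON data to disk
  • Installing MongoDB
  • Storing user profiles in MongoDB using PyMongo
  • Exploring users' geographic information
  • Plotting geographic distributions with Python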

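In this chapter, we use the RESTful APIs of web services to obtain and analyze social media data. Twitter, a microblogging social network, produces a large stream of data that is valuable for data mining, and for text mining in particular. Twitter also provides a very convenient API service, and this chapter shows how to call it from Python. We will use the Twitter API to retrieve social network relationships and save the content in JSON format, both in traditional file storage and in MongoDB, a recently popular NoSQL database. We will then analyze the geographic associations within these social relationships and visualize them. Along the way, you will notice some common patterns in how this kind of API is designed and used. Working with such APIs is an important topic in data science. Understanding them better opens up a whole new world of data and gives you access to far larger volumes of data to analyze.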
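As a rough illustration of the file-storage step mentioned above, the following sketch writes a list of records to disk as JSON and reads it back using only the Python standard library. The function names `save_json` and `load_json`, the file name, and the sample records are all hypothetical and chosen for illustration; they are not taken from the book.

```python
import json
from pathlib import Path


def save_json(records, path):
    """Write JSON-serializable records to disk as UTF-8 text."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII text (e.g. place names) readable.
        json.dump(records, f, ensure_ascii=False, indent=2)
    return path


def load_json(path):
    """Read the records back from disk."""
    with Path(path).open("r", encoding="utf-8") as f:
        return json.load(f)


# Made-up records standing in for user profiles returned by an API.
sample_profiles = [
    {"id": 1, "screen_name": "example_user_a", "location": "Washington, DC"},
    {"id": 2, "screen_name": "example_user_b", "location": "北京"},
]
```

Calling `save_json(sample_profiles, "profiles.json")` followed by `load_json("profiles.json")` should return a list equal to `sample_profiles`. A document database such as MongoDB stores similar JSON-like documents, but that step needs a running server and is not sketched here.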

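API is short for Application Programming Interface. In traditional computer science, it refers to the methods that allow different software programs to call one another. Today, more and more APIs are web APIs, which share data over the internet between different software and web applications (such as Twitter). Acquiring and managing data is an important part of the data science process, and knowing how to use these APIs is an indispensable step in obtaining data from the internet.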
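To make the idea of a web API more concrete, here is a minimal sketch of how a request to such a service is typically addressed: a base URL, a resource path, and query parameters. The base URL, the `followers/ids` resource, and the `screen_name` and `count` parameters are assumptions used for illustration, and the helper `build_request_url` is hypothetical. The sketch only builds the URL; it sends nothing, and a real request to Twitter would also need authentication, which is omitted here.

```python
from urllib.parse import urlencode

# Assumed base URL for the Twitter API v1.1 (illustrative only).
API_BASE = "https://api.twitter.com/1.1"


def build_request_url(resource, **params):
    """Return the URL for a GET request to `resource` with query parameters."""
    url = f"{API_BASE}/{resource}.json"
    # Sort the parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"{url}?{query}" if query else url


followers_url = build_request_url(
    "followers/ids", screen_name="example_user", count=100
)
print(followers_url)
```

The point of the sketch is the pattern rather than the specific endpoint: the resource path names what is being requested, and the query string narrows the request.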

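A RESTful API is a particular kind of API that is widely used by internet applications. We can skip much of the technical jargon, but REST itself must be introduced. REST stands for Representational State Transfer (Representational ...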

Publisher Resources

ISBN: 9781836206774