# get the part after the last / in URL and use as filename
file = url.split("/")[-1]
r = s.get(url)
if r.ok:
    with open(file, "w+b") as f:
        f.write(r.text.encode('utf-8'))
else:
    print("error with URL %s" % url)
Output:
CPU times: user 117 ms, sys: 7.71 ms, total: 124 ms
Wall time: 314 ms
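Depending on your internet connection, the code above may take somewhat longer to run, but it should still be fast. By using the session abstraction, we benefit from persistent HTTP connections (keep-alive), SSL session caching, and similar optimizations that maximize speed.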
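Use proper error handling when downloading URLs
When downloading URLs, you communicate with remote servers over a network protocol. Many kinds of errors can occur along the way, such as URLs that have changed or servers that do not respond. The example above only prints an error message; in the real world, your solution should probably be more sophisticated.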
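As a rough illustration of what more thorough handling might look like, the sketch below adds a timeout, a few retries with a growing pause, and a distinction between client errors (not worth retrying) and transient failures. It is only one possible approach, not the method used in the listing above. To stay free of extra packages it uses the standard library's urllib.request instead of the requests session. The function name download, its parameters, and the index.html fallback filename are all assumptions made for this example.

```python
import time
import urllib.error
import urllib.request
from pathlib import Path


def download(url, target_dir=".", retries=3, timeout=10.0, backoff=1.0):
    """Save url into target_dir; return the saved Path, or None on failure.

    Hypothetical helper for illustration only.
    """
    # Same naming rule as the listing above: the part after the last "/".
    # The "index.html" fallback for URLs ending in "/" is an assumption.
    filename = url.split("/")[-1] or "index.html"
    target = Path(target_dir) / filename

    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                data = response.read()
        except urllib.error.HTTPError as err:
            # The server answered, but with an error status code.
            if err.code < 500:
                # 4xx: the URL is probably wrong or gone; a retry will not help.
                print("giving up on %s: HTTP %d" % (url, err.code))
                return None
            print("attempt %d failed for %s: HTTP %d" % (attempt, url, err.code))
        except OSError as err:
            # Covers urllib.error.URLError (DNS failure, refused connection,
            # missing resource) as well as socket timeouts.
            print("attempt %d failed for %s: %s" % (attempt, url, err))
        else:
            # Only write the file once the whole body has been read.
            target.write_bytes(data)
            return target

        if attempt < retries:
            # Wait a little longer after each failed attempt.
            time.sleep(backoff * attempt)

    return None
```

A caller could loop over a list of URLs, collect those for which download returns None, and log or retry them later instead of stopping at the first failure.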
3.10 Case Study: Downloading HTML Pages with wget
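When you need to download pages in bulk, wget (https://oreil.ly/wget) is a very good tool. It is a widely used command-line utility available on almost every platform. Linux and macOS should already have wget installed, or you can install it easily with a package manager. For Windows ...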