Python机器学习手册:从数据预处理到深度学习
by Chris Albon
July 2019
Intermediate to advanced
365 pages
8h 13m
Chinese
Publishing House of Electronics Industry
Content preview from Python机器学习手册:从数据预处理到深度学习
Chapter 2: Loading Data

make_blobs documentation (http://bit.ly/2FqKMAZ)

2.3 Loading a CSV File

Problem
You need to load a comma-separated values (CSV) file.

Solution
Use the pandas library's read_csv to load a local or remotely hosted CSV file:

# Load library
import pandas as pd

# Create URL
url = 'https://tinyurl.com/simulated_data'

# Load dataset
dataframe = pd.read_csv(url)

# View the first two rows
dataframe.head(2)

   integer             datetime  category
0        5  2015-01-01 00:00:00         0
1        5  2015-01-01 00:00:01         0

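The same call also works on a local path or any file-like object, which is handy when the remote URL is unavailable. The following is a minimal sketch under stated assumptions: the sample rows are made up to mirror the columns shown above (integer, datetime, category), and an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk
csv_text = (
    "integer,datetime,category\n"
    "5,2015-01-01 00:00:00,0\n"
    "5,2015-01-01 00:00:01,0\n"
    "9,2015-01-01 00:00:02,0\n"
)

# read_csv accepts a path, a URL, or a file-like object
dataframe = pd.read_csv(io.StringIO(csv_text))

# Keep only the first two rows, as in the recipe
first_two = dataframe.head(2)
print(first_two)
```

Replacing io.StringIO(csv_text) with a file path such as 'data.csv' (a hypothetical name) should behave the same way.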
Discussion
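There are two things to note about loading CSV files. First, it is often useful to take a quick look at the contents of a file before loading it; this shows you how the dataset is structured and which parameters you need to set when loading it. Second, read_csv has more than 30 parameters, so reading the documentation can be daunting. Fortunately, most of those parameters exist to handle the wide variety of CSV formats. For example, reading the fields of a CSV file usually relies on the assumption that values are separated by commas (e.g., one row might be 2,"2015-01-01 00:00:00",0); however, it is common for CSV files to use other characters as separators, such as tabs. The pandas sep parameter sets the delimiter used in the file. CSV files generally have a fixed format ...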
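The two points above (peek at the file first, then set the delimiter) can be sketched as follows. This is an illustration under stated assumptions, not the book's own code: the tab-delimited sample data is invented, and an in-memory buffer stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical tab-delimited data; io.StringIO stands in for a file on disk
tsv_text = (
    "integer\tdatetime\tcategory\n"
    "5\t2015-01-01 00:00:00\t0\n"
    "5\t2015-01-01 00:00:01\t0\n"
)

# Peek at the first raw lines to see the delimiter and header before parsing
buffer = io.StringIO(tsv_text)
preview = [buffer.readline().rstrip("\n") for _ in range(2)]
print(preview)

# Rewind the buffer, then tell read_csv that fields are separated by tabs
buffer.seek(0)
dataframe = pd.read_csv(buffer, sep="\t")
print(dataframe.shape)
```

Without sep="\t", read_csv would assume commas and would not split each row into the three intended columns.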

Publisher Resources

ISBN: 9787121369629