October 2017
Beginner to intermediate
547 pages
12h 16m
English
Accessing data is a necessary first step for using most of the tools in this book. I focus on data input and output using pandas, though numerous tools in other libraries can help with reading and writing data in various formats.
Input and output typically fall into a few main categories: reading text files and other more efficient on-disk formats, loading data from databases, and interacting with network sources like web APIs.
pandas features a number of functions for reading tabular data as a DataFrame object. Table 6-1 summarizes some of them, though read_csv is likely the one you’ll use the most.
| Function | Description |
|---|---|
| read_csv | Load delimited data from a file, URL, or file-like object; use comma as default delimiter |
| read_fwf | Read data in fixed-width column format (i.e., no delimiters) |
| read_clipboard | Version of read_csv that reads data from the clipboard; useful for converting tables from web pages |
| read_excel | Read tabular data from an Excel XLS or XLSX file |
| read_hdf | Read HDF5 files written by pandas |
| read_html | Read all tables found in the given HTML document |
| read_json | Read data from a JSON (JavaScript Object Notation) string representation |
| read_msgpack | Read pandas data encoded using the MessagePack binary format |
| read_pickle | Read an arbitrary object stored in Python pickle format |
| read_sas | Read a SAS dataset stored in one of the SAS system’s custom ... |
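
As a small illustration of two of the functions in the table, the sketch below reads the same values once as comma-delimited text with read_csv and once as fixed-width text with read_fwf. The sample data and the variable names are made up for this example, and io.StringIO stands in for a real file so the snippet needs nothing on disk.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data, invented for this illustration.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# read_csv accepts a file-like object and splits on commas by default.
df = pd.read_csv(StringIO(csv_text))

# The same values in fixed-width columns, with no delimiter characters.
fwf_text = "a  b  c\n1  2  3\n4  5  6\n"

# widths gives the character width of each column, left to right.
df_fwf = pd.read_fwf(StringIO(fwf_text), widths=[3, 3, 1])

print(df)
print(df_fwf)
```

Both calls treat the first line as the header row, so the two DataFrames should end up with the same column labels and values. Passing a path or URL instead of the StringIO object would follow the same pattern.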