April 2019
Intermediate to advanced
320 pages
6h 42m
English
While the NumPy array is a much-improved N-dimensional array object compared with Python's list, it is insufficient to meet the needs of data science. In the real world, data are often presented in table formats. For example, consider the content of the CSV file shown here:
,DateTime,mmol/L
0,2016-06-01 08:00:00,6.1
1,2016-06-01 12:00:00,6.5
2,2016-06-01 18:00:00,6.7
3,2016-06-02 08:00:00,5.0
4,2016-06-02 12:00:00,4.9
5,2016-06-02 18:00:00,5.5
6,2016-06-03 08:00:00,5.6
7,2016-06-03 12:00:00,7.1
8,2016-06-03 18:00:00,5.9
9,2016-06-04 09:00:00,6.6
10,2016-06-04 11:00:00,4.1
11,2016-06-04 17:00:00,5.9
12,2016-06-05 08:00:00,7.6
13,2016-06-05 12:00:00,5.1
14,2016-06-05 18:00:00,6.9
15,2016-06-06 08:00:00,5.0
16,2016-06-06 12:00:00,6.1
17,2016-06-06 18:00:00,4.9
18,2016-06-07 08:00:00,6.6
19,2016-06-07 12:00:00,4.1
20,2016-06-07 18:00:00,6.9
21,2016-06-08 08:00:00,5.6
22,2016-06-08 12:00:00,8.1
23,2016-06-08 18:00:00,10.9
24,2016-06-09 08:00:00,5.2
25,2016-06-09 12:00:00,7.1
26,2016-06-09 18:00:00,4.9
The CSV file contains rows of data divided into three columns: an index, the date and time of the recording, and the blood glucose reading in mmol/L. To work with data stored as tables, you need a data structure better suited to the task, which is what Pandas provides. While Python supports lists and dictionaries for manipulating structured data, it is not well suited for manipulating numerical tables, such as the one stored in the CSV ...
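As a rough illustration of what a table-oriented structure offers, the following sketch loads the first six rows of the CSV data shown earlier into a Pandas DataFrame. It assumes Pandas is installed. The names `CSV_TEXT`, `df`, `highest`, and `daily_mean` are chosen here for illustration only, and the text is embedded as a string so that no file path needs to be assumed.

```python
import io

import pandas as pd

# First six rows of the CSV shown earlier, embedded so the sketch is self-contained.
CSV_TEXT = """,DateTime,mmol/L
0,2016-06-01 08:00:00,6.1
1,2016-06-01 12:00:00,6.5
2,2016-06-01 18:00:00,6.7
3,2016-06-02 08:00:00,5.0
4,2016-06-02 12:00:00,4.9
5,2016-06-02 18:00:00,5.5
"""

# index_col=0 uses the unnamed first column as the row index;
# parse_dates converts the DateTime strings into datetime values.
df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0, parse_dates=["DateTime"])

# Column-wise operations need no explicit loops.
highest = df["mmol/L"].max()

# Group the readings by calendar day and average each group.
daily_mean = df.groupby(df["DateTime"].dt.date)["mmol/L"].mean()

print(df.dtypes)
print(highest)
print(daily_mean)
```

Doing the same with plain lists and dictionaries would mean converting each field and grouping the rows by hand, which is the kind of bookkeeping the DataFrame handles for you.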