Chapter 1. Laying the Foundation for Reproducible Data Analysis

In this chapter, we will cover the following recipes:

Setting up Anaconda
Installing the Data Science Toolbox
Creating a virtual environment with virtualenv and virtualenvwrapper
Sandboxing Python applications with Docker images
Keeping track of package versions and history in IPython Notebooks
Configuring IPython
Learning to log for robust error checking
Unit testing your code
Configuring pandas
Configuring matplotlib
Seeding random number generators and NumPy print options
Standardizing reports, code style, and data access

Introduction

Reproducible data analysis is a cornerstone of good science. In today's rapidly evolving world of science and technology, reproducibility is a hot topic. Reproducibility ...

Python: End-to-end Data Analysis by Phuong Vothihong, Martin Czygan, Ivan Idris, Magnus Vilhelm Persson, Luiz Felipe Martins
