September 2019
Beginner to intermediate
494 pages
13h
English
The data science pipeline we just went through has two main sections: data cleaning (where we remove inconsistent data, fill in missing data, and appropriately encode the attributes) and data analysis (where we generate visualizations and insights from our cleaned dataset).
The data cleaning process was implemented as a Python script, while the data analysis process was done in a Jupyter notebook. In general, deciding whether a Python program should be written as a script or in a notebook is an important yet often overlooked aspect of working on a data science project.
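To make the script side of this split concrete, here is a minimal sketch of what a cleaning script could look like. It is not the script used earlier in the chapter: the `clean` function, the `age` and `city` fields, and the rule that a negative age counts as inconsistent are all hypothetical, chosen only to illustrate the three cleaning steps on toy data with the standard library.

```python
"""Hypothetical cleaning script: a sketch of the three cleaning steps on toy data."""
from statistics import mean


def clean(rows):
    """Return cleaned copies of rows (a list of dicts with 'age' and 'city' keys)."""
    # 1. Remove inconsistent data: here, a negative age is assumed to be invalid.
    valid = [dict(r) for r in rows if r["age"] is None or r["age"] >= 0]

    # 2. Fill in missing data: replace a missing age with the mean of the known ages.
    known_ages = [r["age"] for r in valid if r["age"] is not None]
    fill_value = mean(known_ages) if known_ages else 0
    for r in valid:
        if r["age"] is None:
            r["age"] = fill_value

    # 3. Encode attributes: map each city name to an integer code.
    codes = {city: i for i, city in enumerate(sorted({r["city"] for r in valid}))}
    for r in valid:
        r["city_code"] = codes[r["city"]]
    return valid


if __name__ == "__main__":
    # Toy input; a real script would read the raw dataset from a file instead.
    raw = [
        {"age": 30, "city": "Oslo"},
        {"age": None, "city": "Bergen"},
        {"age": -5, "city": "Oslo"},
        {"age": 40, "city": "Oslo"},
    ]
    for row in clean(raw):
        print(row)
```

Because the steps always run top to bottom in the same order, a script like this can be rerun unchanged whenever the raw data is updated, which is one reason a fixed cleaning procedure fits a script well.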
As we have discussed in the previous chapter, Jupyter notebooks are perfect for iterative development ...