© Andreas François Vermeulen 2018
Andreas François Vermeulen, Practical Data Science, https://doi.org/10.1007/978-1-4842-3054-1_9

9. Process Superstep

Andreas François Vermeulen
West Kilbride, North Ayrshire, UK

The Process superstep adapts the results of assessing the retrieved versions of your data sources into a highly structured data vault, which forms the basic data structure for the rest of the data science steps. Building this data vault means formulating a standard data amalgamation format that can be used across a range of projects.


If you follow the rules of the data vault, you get a clean and stable structure for your future data science work.
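To give a rough feel for what a data vault structure looks like, the sketch below uses the three table types of general data vault modeling: hubs for business keys, links for relationships between hubs, and satellites for descriptive attributes. It is a minimal illustration only. The record layout, the table names, the functions `hash_key` and `load_vault`, and the sample values are all hypothetical and are not taken from this chapter.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(*parts):
    """Derive a deterministic surrogate key from one or more business keys."""
    # Normalize so that " c1 " and "C1" map to the same key.
    normalized = "|".join(str(part).strip().upper() for part in parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def load_vault(records, source="example-source"):
    """Split flat records into hub, link, and satellite rows (hypothetical layout)."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    audit = {"source": source, "loaded_at": loaded_at}
    vault = {"hub_customer": {}, "hub_product": {}, "link_purchase": {}, "sat_customer": {}}

    for rec in records:
        customer_key = hash_key(rec["customer_id"])
        product_key = hash_key(rec["product_id"])

        # Hubs: one row per distinct business key.
        vault["hub_customer"].setdefault(
            customer_key, {"customer_id": rec["customer_id"], **audit})
        vault["hub_product"].setdefault(
            product_key, {"product_id": rec["product_id"], **audit})

        # Link: one row per distinct relationship between two hubs.
        link_key = hash_key(rec["customer_id"], rec["product_id"])
        vault["link_purchase"].setdefault(
            link_key, {"customer_key": customer_key, "product_key": product_key, **audit})

        # Satellite: descriptive attributes, one row per distinct value seen for a hub key.
        vault["sat_customer"].setdefault(
            (customer_key, rec["customer_name"]),
            {"customer_key": customer_key, "name": rec["customer_name"], **audit})

    return vault


# Hypothetical sample data: three flat purchase records.
records = [
    {"customer_id": "C1", "customer_name": "Ann", "product_id": "P1"},
    {"customer_id": "C1", "customer_name": "Ann", "product_id": "P2"},
    {"customer_id": "C2", "customer_name": "Bob", "product_id": "P1"},
]
vault = load_vault(records)
for table, rows in vault.items():
    print(table, len(rows))
```

Keeping keys, relationships, and attributes in separate tables is what lets the structure stay stable: a new data source only adds rows (or new satellites), without reshaping what is already loaded.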

The Process superstep is the amalgamation process that pipes your data sources into five main categories of data ...
