© Andreas François Vermeulen 2018
Andreas François Vermeulen, Practical Data Science, https://doi.org/10.1007/978-1-4842-3054-1_7

7. Retrieve Superstep

Andreas François Vermeulen
West Kilbride, North Ayrshire, UK

The Retrieve superstep is a practical method for importing a data lake, made up of various external data sources, completely into the processing ecosystem. It is the first point of contact between your data science and the source systems. I will guide you through a methodology for handling this discovery of the data, up to the point where you have all the data you need to evaluate the system you are working with by applying your data science skills.

The successful retrieval of the data is a major stepping-stone to ensuring that ...
