Guerrilla Analytics by Enda Ridge

Chapter 4

Stage 1: Data Extraction

Summary

This chapter describes Data Extraction, the first stage of the Guerrilla Analytics workflow. It discusses the pitfalls and risks associated with extracting data from systems, then makes a set of recommendations that apply Guerrilla Analytics principles to reduce these risks, avoid these pitfalls, and maintain data provenance.

Keywords

Data Extraction
File Formats
Checksums

4.1. Guerrilla Analytics workflow

Data Extraction is the first stage in the Guerrilla Analytics workflow (Section 2.1), as illustrated in Figure 9. It involves taking data out of some system or location so it can be brought into the analytics team’s Data Manipulation Environment (DME). The place the data is extracted from is called ...
