Chapter 4

Stage 1: Data Extraction


This chapter describes the Data Extraction stage of the Guerrilla Analytics workflow. It discusses the pitfalls and risks associated with extracting data from systems, then makes a set of recommendations that apply Guerrilla Analytics principles to reduce these risks, avoid these pitfalls, and maintain data provenance.


Data Extraction
File Formats

4.1. Guerrilla Analytics workflow

Data Extraction is the first stage in the Guerrilla Analytics workflow (Section 2.1), as illustrated in Figure 9. It involves taking data out of some system or location so it can be brought into the analytics team’s Data Manipulation Environment (DME). The place the data is extracted from is called ...
