September 2023
Intermediate to advanced
240 pages
7h 40m
English
The previous chapter described a number of small scripts for personal use, often idiosyncratic or specialized. In this chapter, we’re going to do something that is also typical of how Awk is used in real life: we’ll use it along with other tools to informally explore some real data, with the goal of seeing what it looks like. This is called exploratory data analysis or EDA, a term first used by the pioneering statistician John Tukey.
Tukey invented a number of basic data visualization techniques like boxplots, inspired the statistical programming language S that led to the widely-used R language, co-invented the Fast Fourier Transform, and coined the words “bit” and “software.” The authors knew John Tukey as a friend ...