Basic Statistical Techniques
We will use the following notation to improve the readability of this chapter:
- an uppercase italic letter will denote either a random event (in which case we will use a letter from early in the alphabet, e.g. A) or a random variable (in which case we will use a letter from later in the alphabet, e.g. X);
- a lowercase italic letter will denote either a deterministic variable or a constant;
- a lowercase Greek letter will denote a parameter of a random variable model or an analytic function;
- r.v. stands for random variable;
- s- stands for ‘stochastically’ (i.e. from a probabilistic viewpoint).
3.1 To Explore Data
3.1.1 Fundamental Concepts and Phases of the Exploratory Data Analysis
Exploratory data analysis (EDA) aims to obtain, quickly, a summary of the salient characteristics of the observed phenomenon, mainly by means of graphical tools. EDA is usually the initial phase of an investigation and generally consists of four steps:
- data collection and arrangement;
- data analysis (or processing);
- presentation of the results of the analysis;
- interpretation and discussion of the results.
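As a rough illustration of how these four steps might map onto code, the following minimal Python sketch walks through them using only the standard library. The measurements, the function names (arrange, analyse, present) and the bin width are assumptions made up for this example; they do not come from any real data set or from a method prescribed in this chapter.

```python
import statistics
from collections import Counter

# Hypothetical measurements, invented purely for illustration.
raw = [12.1, 9.8, 11.4, 10.2, 10.9, 13.5, 9.5, 11.0, 10.4, 12.8]


def arrange(values):
    """Step 1: collect and arrange the data (here, sort ascending)."""
    return sorted(values)


def analyse(values):
    """Step 2: compute a few summary statistics."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def present(values, width=1.0):
    """Step 3: build a crude text histogram with bins of the given width."""
    counts = Counter(int(v // width) for v in values)
    lines = []
    for k in sorted(counts):
        lower = k * width
        lines.append(f"[{lower:5.1f}, {lower + width:5.1f}) " + "*" * counts[k])
    return lines


data = arrange(raw)
summary = analyse(data)
histogram = present(data)

for name, value in summary.items():
    print(f"{name:>6}: {value}")
for line in histogram:
    print(line)

# Step 4 (interpretation and discussion) is left to the analyst:
# the code only prepares the material on which that judgement is based.
```

The text histogram stands in for the graphical tools that EDA mainly relies on; in practice a plotting library would normally take its place.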
Data collection first requires the objectives of the analysis to be defined. These objectives should, as far as possible, be clear, precise and unambiguous, and they should state explicitly the temporal and spatial extent of the analysis. Of course, the degree of complexity of the phenomenon to be analysed, its nature and the objectives of the analysis affect the collection modalities, ...