With the types of missing data settled, the next question is: what approaches are there for categorizing it?
This section works through a detailed set of examples using synthetic data and a RapidMiner Studio process that is included in the files accompanying this book. The examples are intended to be followed alongside the text. The process is called
The first step is to create some synthetic data containing missing values of each type. To illustrate the key points, the synthetic data is deliberately kept small so that it can be easily displayed and understood. Real data will not be like this, of course, but the same techniques can be used with high-dimensional data.
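For readers who want to experiment outside RapidMiner Studio, the idea of a small synthetic data set with one attribute per missingness type can be sketched in plain Python. This is not the process that accompanies the book. It assumes the types in question are the usual three (missing completely at random, missing at random, and missing not at random), and the function name, attribute names, row count, and probabilities are all hypothetical choices made for illustration.

```python
import random


def make_synthetic(n=12, seed=1):
    """Build n rows with one attribute per (assumed) missingness type.

    Missing values are represented by None. All names and
    probabilities here are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {
            "x": round(rng.uniform(0, 10), 1),    # always observed
            "a_mcar": round(rng.gauss(5, 2), 1),
            "b_mar": round(rng.gauss(5, 2), 1),
            "c_mnar": round(rng.gauss(5, 2), 1),
        }
        # MCAR: every value has the same chance of being missing.
        if rng.random() < 0.25:
            row["a_mcar"] = None
        # MAR: the chance depends only on the observed attribute x.
        if rng.random() < (0.6 if row["x"] > 5 else 0.05):
            row["b_mar"] = None
        # MNAR: the chance depends on the value that goes missing.
        if rng.random() < (0.6 if row["c_mnar"] > 5 else 0.05):
            row["c_mnar"] = None
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in make_synthetic():
        print(row)
```

With only a dozen rows the whole table can be read at a glance, which is the point of keeping it small. Raising `n` makes the differences between the three mechanisms show up in summary statistics: for example, the observed mean of `c_mnar` drifts below that of `a_mcar`, because large values are removed more often.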
The RapidMiner Studio process ...