To recap, two parts are fundamental to the STL algorithm:
- The width used for smoothing
- The periods in the dataset
When we look at the CO2 dataset, we can count the periods by counting the number of peaks in the chart. I counted 60 peaks. This corresponds to the fact that the observatory has been collecting data for the past 60 years.
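Counting peaks by eye can be cross-checked with a few lines of code. The following is a minimal sketch, not the method used in this chapter: it builds a hypothetical stand-in series (a rising trend plus a clean yearly cycle, with made-up constants) because the real CO2 values are not reproduced here, and it counts strict local maxima. The names `synthetic_monthly_series` and `count_peaks` are illustrative assumptions.

```python
import math


def synthetic_monthly_series(years=60, period=12):
    """Hypothetical stand-in for the CO2 data: rising trend plus a yearly cycle.

    The constants (315.0, 0.1, 3.0) are made up for illustration only.
    """
    return [
        315.0 + 0.1 * t + 3.0 * math.sin(2 * math.pi * t / period)
        for t in range(years * period)
    ]


def count_peaks(values):
    """Count strict local maxima: points higher than both of their neighbours."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


series = synthetic_monthly_series()
peaks = count_peaks(series)

# Dividing the number of samples by the number of peaks gives a rough
# estimate of how many samples make up one period.
samples_per_period = len(series) / peaks
print(peaks, samples_per_period)
```

On this noise-free series the script prints `60 12.0`: one peak per year, twelve samples per period. Real measurements are noisy, so a naive neighbour comparison like this would likely overcount; smoothing the series first, or requiring a minimum distance between peaks, would be a reasonable refinement.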
From here, we move from the hard science of statistics into the softer realm of interpretation. This is common in data science and machine learning, where we often have to let intuition guide us.
In this case, we have a hard starting point: there have been 60 years of data, so we expect at least 60 periods. Another starting point can be found in the notes of the dataset itself: ...