For this task, we return to our old friend the caret package. We'll start by creating a correlation matrix, using the Spearman rank method, then apply the findCorrelation() function to flag features involved in absolute pairwise correlations above 0.9:
df_corr <- cor(gettysburg_treated, method = "spearman")
high_corr <- caret::findCorrelation(df_corr, cutoff = 0.9)
The high_corr object is a vector of integers that correspond to feature column numbers. Let's dig deeper into this:
high_corr
The output of the preceding code is as follows:
[1] 9 ...
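To see how findCorrelation() behaves on something small enough to inspect by hand, here is a minimal sketch on made-up data. The toy_df data frame and its columns x1, x2, and x3 are hypothetical and are not part of the gettysburg_treated data; x2 is built as a near copy of x1, so one of the two should be flagged, while x3 is independent noise.

```r
# Hypothetical toy data: x2 is almost a copy of x1, x3 is unrelated noise
set.seed(123)
x1 <- rnorm(200)
toy_df <- data.frame(
  x1 = x1,
  x2 = x1 + rnorm(200, sd = 0.01),
  x3 = rnorm(200)
)

# Same two steps as in the text: Spearman correlation matrix, then the 0.9 cutoff
toy_corr <- cor(toy_df, method = "spearman")
toy_high <- caret::findCorrelation(toy_corr, cutoff = 0.9)

# toy_high holds column positions, so look up the matching names
flagged <- colnames(toy_df)[toy_high]

# Drop the flagged columns; the guard avoids dropping everything
# when no column is flagged (negative indexing with an empty vector)
toy_reduced <- if (length(toy_high) > 0) {
  toy_df[, -toy_high, drop = FALSE]
} else {
  toy_df
}
```

Because the returned values are column positions, looking up the names with colnames() before dropping anything makes it easier to check that the flagged features are the ones you expect.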