This method is the simplest approach to feature selection, and it's often used as a baseline. It simply removes all features whose variance is below a given threshold. By default, the VarianceThreshold object removes all zero-variance features, but you can control this behavior with the threshold parameter.
Let's create a small dataset composed of 10 observations and 5 features, 3 of them informative:
In: from sklearn.datasets import make_classification
    X, y = make_classification(n_samples=10, n_features=5,
                               n_informative=3, n_redundant=0,
                               random_state=101)
Now, let's measure their variance:
In: import numpy as np
    print("Variance:", np.var(X, axis=0))

Out: Variance: [ 2.50852168 1.47239461 0.80912826 1.51763426 ...
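Having measured the variances, we can apply the selector itself. The sketch below uses scikit-learn's VarianceThreshold on the same dataset; the threshold value of 1.0 is purely illustrative, chosen so that it falls between the smallest and largest variances printed above:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import VarianceThreshold

# Recreate the same small dataset as before
X, y = make_classification(n_samples=10, n_features=5,
                           n_informative=3, n_redundant=0,
                           random_state=101)

# Keep only the features whose variance exceeds 1.0
# (an illustrative threshold, not a recommended default)
selector = VarianceThreshold(threshold=1.0)
X_selected = selector.fit_transform(X)

print("Original shape:", X.shape)
print("Selected shape:", X_selected.shape)
print("Kept features mask:", selector.get_support())
```

The get_support() method returns a Boolean mask over the original columns, which is handy when you need to map the surviving features back to their original names.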