We'll now apply the k-means algorithm to cluster the countries together:

>>> from sklearn.cluster import KMeans
>>> km = KMeans(3, init='k-means++', random_state=3425)  # initialize
>>> km.fit(df.values)
>>> df['countrySegment'] = km.predict(df.values)
>>> df[:5]

After the preceding code is executed, we'll get the following output:

Let's find the average GDP per capita for each country segment:

>>> df.groupby('countrySegment').GDPperCapita.mean()
countrySegment
0    13800.586207
1     1624.538462
2    29681.625000
Name: GDPperCapita, dtype: float64
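The clustering and group-by steps can also be tried end to end on a small stand-in dataset. The following is only a minimal sketch: the `toy` DataFrame, its values, and the `lifeExpectancy` column are made up for illustration and are not the dataset used in this chapter. Setting `n_init` explicitly is an assumption made here so that the behavior doesn't depend on the scikit-learn version.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data for illustration only (not the chapter's dataset)
toy = pd.DataFrame(
    {
        "GDPperCapita": [900, 1200, 1500, 12000, 14000, 15000, 28000, 30000, 31000],
        "lifeExpectancy": [55, 58, 60, 70, 72, 73, 79, 80, 81],
    }
)

# Select the feature columns before adding the label column, so the
# labels never end up among the inputs if the cell is re-run
features = toy[["GDPperCapita", "lifeExpectancy"]].values

# Same settings as in the text; n_init is set explicitly (an assumption)
km = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=3425)

# fit_predict() fits the model and returns the label of each row
toy["countrySegment"] = km.fit_predict(features)

# Average GDP per capita within each segment
segment_means = toy.groupby("countrySegment")["GDPperCapita"].mean()
print(segment_means)
```

Because the features are not rescaled in this sketch, the column with the largest numeric range (here `GDPperCapita`) dominates the distance computation.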

We can see that cluster `2` has the highest average GDP per capita, so we can assume that it includes developed countries. ...
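The label numbers that k-means assigns are arbitrary, so a different `random_state` could give the richest segment a different number. Rather than reading the label off the printed output, it can be looked up programmatically. The sketch below rebuilds the segment means reported above as a pandas Series; the variable names `segment_means`, `richest_segment`, and `ranked` are introduced here only for illustration.

```python
import pandas as pd

# Segment means as reported in the output above
segment_means = pd.Series(
    [13800.586207, 1624.538462, 29681.625000],
    index=pd.Index([0, 1, 2], name="countrySegment"),
    name="GDPperCapita",
)

# Label of the segment with the highest average GDP per capita
richest_segment = segment_means.idxmax()

# All segments ordered from richest to poorest
ranked = segment_means.sort_values(ascending=False)

print(richest_segment)
print(ranked)
```

In the chapter's own session, the same lookup would be applied directly to the result of `df.groupby('countrySegment').GDPperCapita.mean()`.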
