We count the number of users in each occupation.
The following steps build the occupation DataFrame, populate a list of counts from it, and display that list as a bar chart using Matplotlib.
- Get user_data.
- Count the users per occupation by calling groupby("occupation") and then count() on the result.
- Build a list of (occupation, count) tuples from the collected rows.
- Create NumPy arrays holding the x_axis and y_axis values.
- Create a bar plot.
- Display the chart.
The complete code listing follows:
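user_data = get_user_data()
# Count users per occupation and bring the result rows to the driver.
user_occ = user_data.groupby("occupation").count().collect()
user_occ_len = len(user_occ)
user_occ_list = []
# Visit every collected row.
for i in range(user_occ_len):
    element = user_occ[i]
    # A Row is a tuple subclass, so element.count resolves to the tuple method;
    # __getattr__ reads the "count" field instead.
    count = element.__getattr__('count')
    ...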
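The steps above can also be sketched without a Spark session, which may help when checking the counting and plotting logic in isolation. This is a minimal sketch under stated assumptions, not the listing's own code: SAMPLE_OCCUPATIONS, occupation_counts, and plot_occupation_counts are hypothetical names, the sample values are made up for illustration, collections.Counter stands in for groupby("occupation").count(), and sorting by count is an added choice that the steps do not require.

```python
from collections import Counter

# Hypothetical sample standing in for the occupation column of user_data.
SAMPLE_OCCUPATIONS = [
    "student", "engineer", "student", "artist", "engineer", "student",
]


def occupation_counts(occupations):
    """Return (occupation, count) tuples sorted by ascending count."""
    counts = Counter(occupations)
    # Sort by count, then by name, so the order is deterministic.
    return sorted(counts.items(), key=lambda pair: (pair[1], pair[0]))


def plot_occupation_counts(pairs):
    """Draw a bar chart from (occupation, count) tuples."""
    # Imported here so the counting step runs without the plotting libraries.
    import numpy as np
    import matplotlib.pyplot as plt

    x_axis = np.array([name for name, _ in pairs])
    y_axis = np.array([count for _, count in pairs])
    positions = np.arange(len(x_axis))

    plt.bar(positions, y_axis, width=0.8)
    plt.xticks(positions, x_axis, rotation=30)
    plt.ylabel("Number of users")
    plt.tight_layout()
    plt.show()


pairs = occupation_counts(SAMPLE_OCCUPATIONS)
print(pairs)  # [('artist', 1), ('engineer', 2), ('student', 3)]
```

Calling plot_occupation_counts(pairs) displays the chart. With the Spark listing, the same function could be given user_occ_list once it holds the (occupation, count) tuples.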