Exploring image data

Let's begin by looking at the number of images included with each story. We'll run a value count and then plot the numbers:

dfc['img_count'].value_counts().to_frame('count') 

This should display an output similar to the following:

Now, let's plot that same information:

# Frequency of each image count, ordered by the number of images
y = dfc['img_count'].value_counts().sort_index()
x = y.index
fig, ax = plt.subplots(figsize=(8, 6))
ax.bar(x, y, color='k', align='center')
ax.set_title('Image Count Frequency', fontsize=16, y=1.01)
ax.set_xlim(-.5, 5.5)  # show stories with 0 to 5 images
ax.set_ylabel('Count')
ax.set_xlabel('Number of Images')

This code generates the following output:

Already, I'm surprised by ...
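
To try the value count and bar chart without the book's dataset, here is a minimal, self-contained sketch. The `toy` DataFrame and its values are made up purely for illustration and stand in for `dfc`; they say nothing about the real data. The `Agg` backend is an assumption that lets the snippet run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for dfc: one row per story, img_count = images in that story
toy = pd.DataFrame({"img_count": [1, 1, 0, 3, 1, 2, 0, 1]})

# Frequency of each image count, ordered by the count itself rather than by frequency
counts = toy["img_count"].value_counts().sort_index()

# Same bar chart as in the text, drawn from the toy counts
fig, ax = plt.subplots(figsize=(8, 6))
ax.bar(counts.index, counts.values, color="k", align="center")
ax.set_title("Image Count Frequency", fontsize=16, y=1.01)
ax.set_xlabel("Number of Images")
ax.set_ylabel("Count")
```

Sorting by index keeps the bars in order of image count; without `sort_index()`, `value_counts()` orders the result by frequency, which would put the most common image count first.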
