Get the Data, Build the Network

This section uses Wikipedia as its data source.

The first half of the project script consists of the initialization prologue and a heavy-duty loop that retrieves the Wikipedia pages and simultaneously builds the network of nodes and edges.

Let’s first import all necessary modules. We will need the wikipedia module for fetching and exploring Wikipedia pages, the itemgetter function from the operator module for sorting a list of tuples, and, naturally, networkx itself.
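As a minimal sketch of the sorting role that itemgetter plays, the snippet below orders a list of (page, count) tuples by their second field. The page titles and counts are made-up illustrative data, not values from the project, and the wikipedia and networkx imports are left out so that the sketch runs with the standard library alone.

```python
from operator import itemgetter

# Hypothetical (page title, count) pairs, for illustration only.
pages = [("Graph Theory", 12), ("Complex Network", 30), ("Centrality", 7)]

# itemgetter(1) builds a callable that returns element 1 of each tuple,
# so the list is sorted by count, largest first.
ranked = sorted(pages, key=itemgetter(1), reverse=True)
print(ranked)
```

Passing itemgetter(0) instead would sort the same list alphabetically by title.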

To target the snowballing process, define the constant SEED, the name of the starting page. As a side note, by changing the name of the seed page, you can apply this analysis to any other subject on Wikipedia.
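To make the role of SEED concrete, here is a rough sketch of a snowball crawl that starts from a seed and follows links level by level. It is an illustration under assumptions, not the project script: the seed title, the LINKS table standing in for Wikipedia pages, the snowball function, and its max_depth parameter are all hypothetical, and the edges are collected in a plain set rather than a networkx graph so that the sketch runs without third-party packages.

```python
from collections import deque

SEED = "Complex Network"  # hypothetical seed title, chosen for illustration

# Hypothetical stand-in for the outgoing links of each page.
LINKS = {
    "Complex Network": ["Graph Theory", "Centrality"],
    "Graph Theory": ["Centrality", "Leonhard Euler"],
    "Centrality": ["Complex Network"],
    "Leonhard Euler": ["Graph Theory"],
}

def snowball(seed, links, max_depth=2):
    """Breadth-first crawl from seed; returns (nodes, edges)."""
    nodes = {seed}            # every page discovered so far
    edges = set()             # directed (source, target) pairs
    queue = deque([(0, seed)])
    while queue:
        depth, page = queue.popleft()
        if depth >= max_depth:
            continue          # page is recorded, but its links are not followed
        for target in links.get(page, []):
            edges.add((page, target))
            if target not in nodes:
                nodes.add(target)
                queue.append((depth + 1, target))
    return nodes, edges

nodes, edges = snowball(SEED, LINKS, max_depth=1)
print(sorted(nodes))
print(sorted(edges))
```

Changing SEED to another key of the table re-targets the crawl. Each pair in edges could later be handed to a networkx graph through its add_edge method.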

Last but not least, when you start the snowballing, you will eventually (and ...

