In the previous chapter we spent a fair bit of time on the abstract concepts of graph theory. In this chapter, we return to earth and start using those concepts to analyze real social networks. We'll take a sample dataset from the blogging site LiveJournal.com (specifically, a group of very vocal Russian expatriates) and try to learn about their community using SNA metrics.
The first set of metrics we will look at is called “centrality.” People new to the field often learn about degree centrality without realizing that it is just one of a family of metrics that can be used together or separately. In this chapter we'll explore the four most popular centrality metrics, and learn to visualize and combine them.
But first, let’s get acquainted with the data.
LiveJournal is a lively blogging site that is very popular in Russia and Eastern Europe. It currently serves close to 38 million blogs, most of them in languages other than English. The underlying server software is open-source, and presents a simple API and a generous policy for data mining and robots. (See http://www.livejournal.com/bots/.)
We are going to use a data-gathering protocol called snowball sampling (see Appendix A) to obtain a dataset that is suitable for further analysis.
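To give a feel for what snowball sampling involves before turning to Appendix A, here is a minimal sketch. It assumes the common formulation: start from one seed user, collect that user's friends, then their friends, and stop after a fixed number of hops. The names `TOY_FRIENDS`, `get_friends`, and `snowball_sample` are hypothetical, and the friends lists come from a small in-memory dictionary rather than from LiveJournal itself.

```python
from collections import deque

# Hypothetical toy data standing in for friends lists fetched from a site.
TOY_FRIENDS = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave", "erin"],
    "dave": ["bob", "carol", "frank"],
    "erin": ["carol"],
    "frank": ["dave", "grace"],
    "grace": ["frank"],
}


def get_friends(user):
    """Hypothetical lookup; a real crawler would query the site here."""
    return TOY_FRIENDS.get(user, [])


def snowball_sample(seed, max_depth):
    """Collect users and friendship edges within max_depth hops of seed."""
    depth = {seed: 0}      # user -> number of hops from the seed
    edges = set()          # undirected edges stored as sorted pairs
    queue = deque([seed])
    while queue:
        user = queue.popleft()
        if depth[user] >= max_depth:
            continue       # users at the boundary are recorded but not expanded
        for friend in get_friends(user):
            edges.add(tuple(sorted((user, friend))))
            if friend not in depth:
                depth[friend] = depth[user] + 1
                queue.append(friend)
    return depth, edges


depths, edges = snowball_sample("alice", max_depth=2)
print(sorted(depths.items()))
print(sorted(edges))
```

The resulting edge set can be loaded into a NetworkX graph with `add_edges_from` for the analysis that follows. The depth limit is what keeps the snowball from rolling through the entire network.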
If you have not yet installed Python and the NetworkX environment, please refer to Appendix B.
Now, let us get oriented ...