Social Network Analysis for Startups

by Maksim Tsvetovat, Alexander Kouznetsov

September 2011

Beginner

188 pages

4h 57m

English

O'Reilly Media, Inc.

A Note Regarding Supplemental Files
Preface
Prerequisites; Open-Source Tools; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Content Updates; March 16, 2012; Thanks
1. Introduction
Analyzing Relationships to Understand People and Groups; Binary and Valued Relationships; Symmetric and Asymmetric Relationships; Multimode Relationships; From Relationships to Networks—More Than Meets the Eye; Social Networks vs. Link Analysis; The Power of Informal Networks; Terrorists and Revolutionaries: The Power of Social Networks; Social Networks in Prison; Informal Networks in Terrorist Cells; The Revolution Will Be Tweeted; Social Media and Social Networks; Egyptian Revolution and Twitter
2. Graph Theory—A Quick Introduction
What Is a Graph?; Adjacency Matrices; Edge-Lists and Adjacency Lists; 7 Bridges of Königsberg; Graph Traversals and Distances; Depth-First Traversal; Implementation; DFS with NetworkX; Breadth-First Traversal; Algorithm; BFS with NetworkX; Paths and Walks; Dijkstra’s Algorithm; Graph Distance; Graph Diameter; Why This Matters; 6 Degrees of Separation is a Myth!; Small World Networks
3. Centrality, Power, and Bottlenecks
Sample Data: The Russians are Coming!; Get Oriented in Python and NetworkX; Read Nodes and Edges from LiveJournal; Snowball Sampling; Saving and Loading a Sample Dataset from a File; Centrality; Who Is More Important in this Network?; Find the “Celebrities”; Degree centrality in the LiveJournal network; Find the Gossipmongers; Find the Communication Bottlenecks and/or Community Bridges; Putting It Together; Who Is a “Gray Cardinal?”; In practice; Klout Score; PageRank—How Google Measures Centrality; Simplified PageRank algorithm; What Can’t Centrality Metrics Tell Us?
4. Cliques, Clusters and Components
Components and Subgraphs; Analyzing Components with Python; Islands in the Net; Subgraphs—Ego Networks; Extracting and Visualizing Ego Networks with Python; Triads; Fraternity Study—Tie Stability and Triads; Triads and Terrorists; The “Forbidden Triad” and Structural Holes; Structural Holes and Boundary Spanning; Triads in Politics; Directed Triads; Analyzing Triads in Real Networks; Real Data; Cliques; Detecting Cliques; Hierarchical Clustering; The Algorithm; Clustering Cities; Preparing Data and Clustering; Block Models; Triads, Network Density, and Conflict
5. 2-Mode Networks
Does Campaign Finance Influence Elections?; Theory of 2-Mode Networks; Affiliation Networks; Attribute Networks; A Little Math; 2-Mode Networks in Practice; PAC Networks; Candidate Networks; Expanding Multimode Networks; Exercise
6. Going Viral! Information Diffusion
Anatomy of a Viral Video; What Did Facebook Do Right?; How Do You Estimate Critical Mass?; Wikinomics of Critical Mass; Content is (Still) King; Heterogenous Preferences; How Does Information Shape Networks (and Vice Versa)?; Birds of a Feather?; Homophily vs. Curiosity; Boundary Spanners; Weak Ties; Dunbar Number and Weak Ties; A Simple Dynamic Model in Python; Influencers in the Midst; Exercises for the Reader; Coevolution of Networks and Information; Exercises for the Reader; Why Model Networks?
7. Graph Data in the Real World
Medium Data: The Tradition; Big Data: The Future, Starting Today; “Small Data”—Flat File Representations; EdgeList Files; .net Format; GML, GraphML, and other XML Formats; Ancient Binary Format—##h Files; “Medium Data”: Database Representation; What are Cursors?; What are Transactions?; Names; Nodes as Data, Attributes as ?; The Class; Functions and Decorators; Decorator notation; The Adaptor; Working with 2-Mode Data; Exercises for the Reader; Social Networks and Big Data; NoSQL; Structural Realities; Plain text is king; The freedom to store; Computational Complexities; Big Data is Big; Big Data at Work; What Are We Distributing?; Hadoop, S3, and MapReduce; Hive; SQL is Still Our Friend
A. Data Collection
A Note on the Ethics of Data Collection; The Old-Fashioned Way; Mining Server Logs; Mining Social Media Sites; Business and Investments; Politics, Elections, and Courts; Blogosphere and Social Bookmarking; Twitter Data Collection; Facebook; Private Ego-Networks; Facebook Social Graph API

B. Installing Software
Why (We Love) Python?; Exploratory Programming; Python; IPython; NetworkX; matplotlib; pylab: matplotlib with IPython
About the Authors
Copyright

Content preview from Social Network Analysis for Startups

Chapter 4. Cliques, Clusters and Components

In the previous chapter, we mainly talked about properties of individuals in a social network. In this chapter, we start working with progressively larger chunks of the network, analyzing not just the individuals and their connection patterns, but entire subgraphs and clusters. We’ll explore what it means to be in a triad and what benefits and stresses can come from being in a structural hole.

First, we will deconstruct the network by progressively removing parts to find its core(s); then, we’ll reconstruct the network from its constituent parts: dyads, triads, cliques, clans, and clusters.

Components and Subgraphs

To start teasing networks apart into analyzable parts, let us first make a couple of definitions:

A subgraph is a subset of the nodes of a network, together with all of the edges linking those nodes. Any group of nodes can form a subgraph, and further down we will describe several interesting ways to use this.
Component subgraphs (or simply components) are portions of the network that are disconnected from each other. Before the meeting of Romeo and Juliet, the two families were quite separate (save for the conflict ties), and thus could be treated as components.
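
The two definitions above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the edge list is a hypothetical toy network (two families, with the conflict ties left out), and the helper names `build_adjacency`, `subgraph_edges`, and `connected_components` are made up for this sketch. In NetworkX, `Graph.subgraph` and `nx.connected_components` serve the same purpose.

```python
from collections import deque

# Hypothetical toy network as an edge list: two families, no ties between them.
edges = [
    ("Romeo", "Benvolio"),
    ("Romeo", "Lady Montague"),
    ("Juliet", "Tybalt"),
    ("Juliet", "Lady Capulet"),
]

def build_adjacency(edge_list):
    """Turn an undirected edge list into a node -> set-of-neighbors mapping."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def subgraph_edges(edge_list, nodes):
    """Edges of the subgraph: only those whose two endpoints are both selected."""
    keep = set(nodes)
    return [(a, b) for a, b in edge_list if a in keep and b in keep]

def connected_components(adjacency):
    """Group nodes into components using a breadth-first search from each unseen node."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return sorted(components)

adjacency = build_adjacency(edges)
sub = subgraph_edges(edges, ["Romeo", "Benvolio", "Juliet"])
components = connected_components(adjacency)

print(sub)         # edges among the three chosen nodes
print(components)  # one sorted list of names per component
```

Note that Juliet belongs to the chosen subgraph but has no edges in it, because neither of her neighbors was selected; the components, by contrast, split the full network along family lines.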

Many real networks (especially those collected with random sampling) have multiple components. One could argue that this is a sampling error (which is very possible), but at the same time, it may just mean that the ties between components are outside of the scope of the sampling and ...


Publisher Resources

ISBN: 9781449311377
