17. Phi Divergence and Consistent Estimation for Stochastic Block Model

Here, we consider the problem of clustering and inference in stochastic block models, which are applied notably in social network analysis and biology. For these random graph models, most of the literature proposes methods based on maximizing the likelihood. However, the maximum likelihood approach is known to be non-robust to misspecification. To address this issue, we introduce new criteria based on divergences in the sense of Csiszár. These criteria allow us to estimate both the group structure of the graph and the parameters of the model. We show the convergence of the new criteria under the weighted stochastic block model, subject to some assumptions. We also study the robustness of our estimators in the presence of misspecification and of outliers in edge values. Moreover, we provide simulations to support our theoretical results. Finally, we present an application to a real dataset.
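As background for the divergences mentioned above, the following is a minimal sketch of a divergence in the sense of Csiszár between two discrete distributions, D_phi(P || Q) = sum_i q_i * phi(p_i / q_i), where phi is convex with phi(1) = 0. It is an illustration only and not the estimation criterion developed in this chapter. The function names (`phi_divergence`, `phi_kl`, `phi_hellinger`) are hypothetical, and the normalization chosen for each phi is an assumption, since conventions vary between authors.

```python
import numpy as np


def phi_divergence(p, q, phi):
    """Discrete phi-divergence: sum_i q_i * phi(p_i / q_i).

    Assumes q_i > 0 for every i (a simplification made for this sketch).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if np.any(q <= 0):
        raise ValueError("this sketch assumes q_i > 0 for every i")
    return float(np.sum(q * phi(p / q)))


def phi_kl(x):
    """phi(x) = x*log(x) - x + 1, which yields the Kullback-Leibler divergence."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)  # avoid log(0); convention 0*log(0) = 0
    return np.where(x > 0, x * np.log(safe), 0.0) - x + 1.0


def phi_hellinger(x):
    """phi(x) = 2*(sqrt(x) - 1)^2, a squared Hellinger-type distance.

    The constant factor is an assumed normalization.
    """
    x = np.asarray(x, dtype=float)
    return 2.0 * (np.sqrt(x) - 1.0) ** 2


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.25, 0.75]
    print("KL-type divergence:       ", phi_divergence(p, q, phi_kl))
    print("Hellinger-type divergence:", phi_divergence(p, q, phi_hellinger))
```

Choosing a different convex function phi gives a different member of the family. The likelihood-based approach corresponds to the Kullback-Leibler choice, while other choices are the usual route to estimators that are less sensitive to outliers.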

17.1. Introduction

For several years, there has been great interest in modeling data with random graphs. One of the most popular models is the stochastic block model (SBM), which is used in fields such as social network analysis and biology (see, for example, Girvan and Newman (2002) and Newman (2006)). Because of its many applications, inference under this model has been widely studied. Only recently has there been interest in the question of whether or not the data really comes ...
