8.1 Introduction

Graphs are highly versatile data structures [1] that are used in a wide range of research areas. Consequently, the comparison of two graphs is a problem with many applications. Examples include cheminformatics [2,3], bioinformatics [4], sociology [5,6], telecommunication (e.g., the internet), computer vision [7], and natural language processing [8].

Typical approaches for comparing two graphs are based on

1. comparing sets of vertices and edges, which is fast (runtime often scales linearly or quadratically in the size of the graphs), but neglects graph topology,
2. mining frequent or discriminative subgraphs, which can be slow (with potentially exponentially many subgraphs to be considered),
3. graph kernels, which constitute a trade-off between runtime and discriminative power.

Another possibility [9] is to categorize graph (dis)similarity measures into those based on (sub)graph isomorphism [10,11] (e.g., Zelinka distance [12], or, distances based on largest common subgraph and smallest common supergraph [13]), graph transformations (e.g., edit distance [14]), adjacency matrices [15], grammars [16], and others. Yet another view is to distinguish between the approaches that construct explicit features, such as graph invariants, and those that perform implicit comparisons, such as graph kernels. The latter are thus a special case of graph similarity measures.

Informally, a graph kernel is a function defined directly on two graphs that corresponds to an inner ...

Get Statistical and Machine Learning Approaches for Network Analysis now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.