Chapter 9

Network Motif Discovery

9.1 Introduction

A network motif of a graph G is a subgraph of G that appears signiﬁcantly more

than expected in G. Formally, network motif is a subgraph that exists with a much

higher frequency in G than expected in a similar random graph to G. A motif of

size k is called a k-motif. Milo et. al. were ﬁrst to discover these over-represented

subgraphs in the networks they studied and proposed the network motifs as the basic

building blocks of complex networks [20]. Several motifs have been shown to have

functional signiﬁcance in biological networks such as the PPI networks [3] and the

gene regulating networks [22] some of which are shown in Figure 9.1.

The feed-forward loop motif for example, was shown to have information ﬁl-

tering capabilities during the gene regulation process. Network motifs may exist in

undirected graphs as in the PPI networks or in directed graphs such as the gene-

regulating networks. Network motif search is NP-hard, and therefore heuristic al-

gorithms are frequently used. This problem is closely related to the subgraph iso-

morphism where we search the isomorphic occurrences of a small graph in a large

graph.

In this chapter, we ﬁrst describe the network motifs in detail and analyze metrics

for their signiﬁcance. We then look into the related problem of subgraph isomor-

phism and describe fundamental algorithms that are used for subgraph isomorphism.

However, network motif discovery is not only ﬁnding an isomorphic subgraph of a

graph as we need to asses the number of occurrences of this subgraph to consider it

as a motif. In other words, we need to ﬁnd all occurrences of the motif. Finding all

existing motifs of a given size is called census and we investigate exact and approxi-

mate census algorithms in Section 1.4.

169

170

Complex Networks: An Algorithmic Perspective

(a)

(b)

(c)

Figure 9.1: Motifs in biological networks. a) Feed-forward loop. b) A single input

motif. c) A multi-input motif

9.2 Network Motifs

A motif is a connected and usually a small graph m with k number of vertices called

its size. A match G

′

of a motif in the target graph G is a subgraph of G which is

isomorphic to m. Two possible undirected motifs of size three are shown in Figure 9.2

and Figure 9.3 displays all possible thirteen directed motifs with the size 3. Network

motif search and analysis in general are concerned with the more difﬁcult problem

of directed networks.

The number of occurrences of a motif m in the graph G is called its f requency

and there are three variations of this frequency as F

1

, F

2

and F

3

in G [27]. The F

1

frequency is related to all matchings, including overlapping ones of m in G; F

2

is

the frequency of edge disjoint matchings, and F

3

is the frequency of edge and ver-

tex disjoint matchings. Figure 9.4 displays the feed-forward loop motifs in a small

network. There are four occurrences of this motif and subgraphs m

1

and m

4

are ver-

tex and edge disjoint whereas m

1

, m

2

, m

3

share vertex a; m

2

, m

1

, m

2

share vertex

f; and m

2

, m

3

and m

4

share vertex e. Subgraphs m

1

and m

2

share the edge (a, f )

and m

2

and m

3

share the edge (a,e). Examining Figure 9.4, we ﬁnd F

1

as 4 which

includes all matchings m

1

,m

2

,m

3

,m

4

; F

2

as 3 for matchings m

1

,m

3

,m

4

, and F

3

as 2

for matchings m

1

and m

4

.

Finding the maximum number of unique, nonoverlapping matches of a motif (F

3

)

is equivalent to the maximum independent set problem we have seen in Section 6.2.

(a)

(b)

Figure 9.2: Undirected network motifs of size 3. a) A triad. b) A triangle

Network Motif Discovery

171

Figure 9.3: All thirteen directed network motifs of size 3

(a)

(b)

a

g

f

a

ef

a

e

b

d

e

c

(c)

a

g

d

ef

b

c

m1

m2

m4

m3

Figure 9.4: Motifs in a graph. a) The feed-forward loop motif. b) The target graph.

c) Motifs m

1

,m

2

,m

3

and m

4

discovered

As this problem is NP-hard [6], ﬁnding F

3

of a motif m in a graph G is also NP-

hard which necessitates the use of heuristics to compute a lower bound for F

3

. Table

9.1 shows the number of possible motifs of sizes 3,..,10 in undirected and directed

graphs. Clearly, the number of directed graphs grows much faster than the number

of undirected graphs for the same number of vertices.

Table 9.1: Number of Possible Network Motifs

Motif size 3 4 5 6 7 8 9 10

Undirected subgraphs 2 6 21 112 853 ≈ 10

4

≈ 10

5

≈ 10

7

Directed subgraphs 13 199 9364 10

6

≈ 10

9

≈ 10

12

≈ 10

16

≈ 10

20

Start Free Trial

No credit card required