Chapter 24

Finding repetitions in biological networks: challenges, trends, and applications

SIMONA E. ROMBO

24.1 Introduction

For many years, the analysis of biological sequences (also termed biosequences) associated with proteins and genomes played a key role in understanding the mechanisms inside the cell [7, 47, 49]. After the genome sequencing of several organisms was completed [13], significant attention began to focus on how cellular components interact with each other to accomplish the biological functions of the cell [59].

While biosequences are usually represented by strings defined on a finite alphabet, where symbols are associated with amino acids or nucleotides, interaction data can instead be modeled by graphs, called biological networks, where nodes represent cellular components and edges represent their interactions. The set of all the protein–protein interactions (PPIs) of a specific organism is its interactome.
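The two representations just described can be illustrated with a minimal Python sketch. All names and data here (`protein_sequence`, `AMINO_ACIDS`, `build_network`, the protein labels `P1` to `P4`) are hypothetical and chosen only for illustration; they do not come from any real organism or from the chapter itself.

```python
from collections import defaultdict

# Hypothetical toy data, for illustration only.
# A biosequence is a string over a finite alphabet; here, the 20 standard
# amino acids in one-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
protein_sequence = "MKTAYIAKQR"


def is_valid_sequence(seq, alphabet):
    """Return True if every symbol of seq belongs to the alphabet."""
    return all(symbol in alphabet for symbol in seq)


def build_network(interactions):
    """Build an undirected graph (adjacency sets) from pairwise interactions.

    Nodes are components (e.g. proteins); an edge joins two components
    that interact.
    """
    network = defaultdict(set)
    for a, b in interactions:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


# A toy set of protein-protein interactions (assumed, not real data).
interactions = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
network = build_network(interactions)
```

An adjacency-set dictionary is only one possible encoding of a biological network; it is used here because neighbor lookups are simple and the graph is undirected.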

Despite the different models adopted to analyze biological data, these data share a common characteristic: their intrinsic repetitiveness. Repetitions are often biologically interesting, for example when repeated elements are found among cells belonging to different organisms, or when a specific feature appears several times in the same cell. Suitable statistical indices can often be used to characterize the significance of the repetitions found [51].

Searching ...

Get Algorithmic and Artificial Intelligence Methods for Protein Bioinformatics now with the O’Reilly learning platform.