CHAPTER 6

ALGORITHMIC ASPECTS OF ARC-ANNOTATED SEQUENCES

Guillaume Blin, Maxime Crochemore, and Stéphane Vialette

6.1 INTRODUCTION

Structure comparison for RNA has become a central computational problem bearing many computer science challenging questions. Indeed, RNA secondary structure comparison is essential for (i) identification of highly conserved structures during evolution (which always cannot be detected in the primary sequence because it is often unpreserved), which suggests a significant common function for the studied RNA molecules, (ii) RNA classification of various species (phylogeny), (iii) RNA folding prediction by considering a set of already known secondary structures, and (iv) identification of a consensus structure and consequently of a common role for molecules.

From an algorithmic point of view, RNA structure comparison first was considered in the framework of ordered trees [21]. More recently, it also has been considered in the framework of arc-annotated sequences [10]. An arc-annotated sequence is a pair (S,P), where S is a sequence of RNA bases and P represents hydrogen bonds between pairs of elements of S. From a purely combinatorial point of view, arc-annotated sequences are a natural extension of simple sequences. However, using arcs for modeling nonsequential information together with restrictions on the relative positioning of arcs allow for varying restrictions on the structure of arc-annotated sequences.

Different pattern matching and motif search ...

Get Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.