CHAPTER 6

STRUCTURAL SEARCH IN RNA MOTIF DATABASES

DONGRONG WEN AND JASON T. L. WANG

6.1 INTRODUCTION

Ribonucleic acid (RNA) is transcribed from deoxyribonucleic acid (DNA) and plays a key role in the synthesis of proteins [1]. An RNA structural motif is a substructure of an RNA molecule that has a significant biological function. Well-known RNA structural motifs include the iron response element (IRE) and the histone 3′-UTR stem-loop (HSL3) [2,3]. As more RNA structural motifs are discovered, it becomes crucial to maintain databases of these motifs that researchers can access and use. Rfam [4] and RNA STRAND [5] are two such databases.

Rfam is a well-annotated, open-access database containing information on noncoding RNA (ncRNA) families as well as other RNA structural motifs. The latest version, Rfam 9.0, comprises 603 families in total and is available at http://rfam.sanger.ac.uk/. In Rfam, each ncRNA family is represented by two structure-annotated multiple sequence alignments (MSAs): the seed alignment and the full alignment. Each MSA is associated with a consensus secondary structure, represented in Stockholm format [6,7]. The seed alignment consists of functionally related RNA sequences obtained from the literature or wet lab experiments.
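To make the idea of a structure-annotated alignment concrete, the following minimal Python sketch reads a toy Stockholm-format alignment and recovers the base pairs encoded in its consensus secondary structure line. The alignment itself, the sequence names, and the helper functions `parse_stockholm` and `base_pairs` are invented for illustration and are not taken from Rfam or from any Rfam tool. The sketch assumes that the consensus structure is given on `#=GC SS_cons` lines and that paired columns are marked with matching brackets, with every other character treated as unpaired.

```python
# Toy Stockholm-format alignment. The sequence names and residues are
# hypothetical and were made up for this example.
TOY_STOCKHOLM = """\
# STOCKHOLM 1.0
seq1          GGCAUUCGGCC
seq2          GGGAUACGCCC
#=GC SS_cons  <<<.....>>>
//
"""

# Bracket characters assumed to denote paired columns in the SS_cons line.
OPEN_TO_CLOSE = {"<": ">", "(": ")", "[": "]", "{": "}"}
CLOSE_TO_OPEN = {close: open_ for open_, close in OPEN_TO_CLOSE.items()}


def parse_stockholm(text):
    """Return (sequences, ss_cons) from a Stockholm-format string.

    Only sequence lines and '#=GC SS_cons' lines are interpreted;
    all other markup lines are skipped.
    """
    sequences = {}
    ss_cons = ""
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "//":
            continue
        if line.startswith("#=GC SS_cons"):
            # Concatenate so that interleaved (multi-block) files also work.
            ss_cons += line.split()[2]
        elif line.startswith("#"):
            continue  # header or other annotation
        else:
            name, residues = line.split()[:2]
            sequences[name] = sequences.get(name, "") + residues
    return sequences, ss_cons


def base_pairs(ss_cons):
    """Return sorted (i, j) column pairs (0-based) from a bracket string."""
    stacks = {open_: [] for open_ in OPEN_TO_CLOSE}
    pairs = []
    for i, ch in enumerate(ss_cons):
        if ch in OPEN_TO_CLOSE:
            stacks[ch].append(i)
        elif ch in CLOSE_TO_OPEN:
            pairs.append((stacks[CLOSE_TO_OPEN[ch]].pop(), i))
    return sorted(pairs)


sequences, ss_cons = parse_stockholm(TOY_STOCKHOLM)
for i, j in base_pairs(ss_cons):
    # Show which residues each sequence places in the paired columns.
    print(i, j, [seq[i] + "-" + seq[j] for seq in sequences.values()])
```

Keeping a separate stack for each bracket type lets the same loop handle structures that mix several bracket styles; for an alignment with only one bracket type, a single stack would suffice.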

The seed alignment is used to build a covariance model, which the Infernal program [6] then uses to collect additional functionally related RNA sequences. ...
