CHAPTER 12

ALGORITHMS FOR THE ALIGNMENT OF BIOLOGICAL SEQUENCES

Ahmed Mokaddem and Mourad Elloumi

12.1 INTRODUCTION

Bioinformatics is a science dedicated to the automatic processing of information related to biological macromolecules (i.e., DNA, RNA, and proteins). These macromolecules are coded by strings called biological sequences. Every character in a string codes a constituent of the macromolecule. DNA, RNA, and proteins can be coded by sequences in which every character is in {A, T, C, G}, {A, U, C, G}, and {A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y}, respectively. Among the most studied problems in bioinformatics is the comparison of biological sequences in order to identify similar substrings, occurring in the same order, in these sequences. This operation makes a very important contribution to the analysis of biological macromolecules. In fact, it can reveal information about functions shared by biological macromolecules from several different organisms, by identifying regions that are shared by the sequences coding these macromolecules. These regions, which have been conserved during evolution, often play an important structural or functional role and, consequently, shed light on the mechanisms and the biological processes in which these macromolecules participate. In addition, the comparison of biological sequences permits the detection of functional regions. It is also used in evolutionary studies to analyze relationships that exist ...
