CHAPTER 4

NEW DEVELOPMENTS IN PROCESSING OF DEGENERATE SEQUENCES

Pavlos Antoniou and Costas S. Iliopoulos

4.1 INTRODUCTION

Degenerate sequences are sequences that have several possible letters in some of their positions. In terms of biological sequences, degenerate sequences can have more than one base or amino acid in some positions. For example, in the DNA sequence AG[CT]ACC[ACT]A, at position 3, we have either C or T, and in position 7 we can have either A, C, or T.
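The bracket notation above can be made concrete with a short Python sketch. It is an illustration only, not the authors' method: the function names parse_degenerate, matches, and expand are hypothetical, and the code assumes that every degenerate position is written as a bracketed group of single-letter symbols.

```python
from itertools import product


def parse_degenerate(pattern):
    """Parse bracket notation such as 'AG[CT]A' into one letter set per position."""
    positions = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            # A bracketed group lists all letters allowed at this position.
            close = pattern.index("]", i)
            positions.append(frozenset(pattern[i + 1:close]))
            i = close + 1
        else:
            # A plain letter is a position with exactly one allowed letter.
            positions.append(frozenset(pattern[i]))
            i += 1
    return positions


def matches(positions, sequence):
    """Return True if the plain sequence is one instance of the degenerate sequence."""
    return len(positions) == len(sequence) and all(
        letter in allowed for allowed, letter in zip(positions, sequence)
    )


def expand(positions):
    """List every plain sequence that the degenerate sequence represents."""
    return ["".join(choice) for choice in product(*(sorted(p) for p in positions))]


positions = parse_degenerate("AG[CT]ACC[ACT]A")
print(len(positions))                   # 8 positions
print(matches(positions, "AGTACCCA"))   # True: T at position 3, C at position 7
print(len(expand(positions)))           # 2 * 3 = 6 plain sequences
```

Representing each position as a set keeps matching linear in the sequence length. Explicit expansion grows with the product of the set sizes, so it is practical only for short sequences with few degenerate positions.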

The processing of degenerate sequences poses problems that have interested researchers because of their direct applications in biology, cryptography, and music. In music, for example, single notes may match chords. In cryptography, undecoded symbols may match one of a specific set of letters in the alphabet [9].

In computational biology research, degenerate sequences have been used extensively to represent polymorphisms in DNA/RNA sequences. These polymorphisms are caused by redundancy of the genetic code in coding regions, by polymorphism in binding sites, or simply by errors and limitations of the sequencing equipment in biological labs. Additionally, biologists have been particularly interested in degenerate sequences for the problem of degenerate primer design in polymerase chain reaction (PCR) sequences [17].

4.1.1 Degenerate Primer Design Problem

PCR is a process that amplifies a specific region of DNA to provide enough copies of that region to be tested or sequenced. To use this PCR process, the biologists ...
