CHAPTER 4
NEW DEVELOPMENTS IN PROCESSING OF DEGENERATE SEQUENCES
4.1 INTRODUCTION
Degenerate sequences are sequences that allow several possible letters in some of their positions. In biological sequences, this means that some positions can hold more than one base or amino acid. For example, in the DNA sequence AG[CT]ACC[ACT]A, position 3 is either C or T, and position 7 is A, C, or T.
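The bracket notation above can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function names (parse_degenerate, count_expansions, expand, matches) are hypothetical and do not come from this chapter, and the parser assumes well-formed input in which every "[" has a matching "]".

```python
from itertools import product
from math import prod


def parse_degenerate(seq):
    """Turn bracket notation into a list of letter sets, one set per position.

    Assumes well-formed input: every '[' is closed by a later ']'.
    """
    positions = []
    i = 0
    while i < len(seq):
        if seq[i] == "[":
            j = seq.index("]", i)          # end of the bracketed group
            positions.append(frozenset(seq[i + 1:j]))
            i = j + 1
        else:
            positions.append(frozenset(seq[i]))
            i += 1
    return positions


def count_expansions(positions):
    """Number of plain sequences the degenerate sequence stands for."""
    return prod(len(p) for p in positions)


def expand(positions):
    """Yield every plain sequence, in alphabetical order per position."""
    for combo in product(*(sorted(p) for p in positions)):
        yield "".join(combo)


def matches(positions, plain):
    """True if the plain sequence agrees with the letter set at every position."""
    return len(plain) == len(positions) and all(
        letter in allowed for letter, allowed in zip(plain, positions)
    )


positions = parse_degenerate("AG[CT]ACC[ACT]A")
print(len(positions))               # 8 positions
print(count_expansions(positions))  # 2 * 3 = 6 plain sequences
print(matches(positions, "AGTACCCA"))
```

Storing one set of letters per position keeps matching a simple membership test at each position. Full expansion grows multiplicatively with the number of degenerate positions, so it is practical only for short sequences such as this example.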
Processing these degenerate sequences raises problems that have interested researchers because of their direct applications in biology, cryptography, and music. In music, for example, single notes may match chords. In cryptography, undecoded symbols may match one of a specific set of letters in the alphabet[9].
In computational biology research, degenerate sequences have been used extensively to represent polymorphisms in DNA/RNA sequences. These polymorphisms arise from the redundancy of the genetic code in coding regions, from polymorphism in binding sites, or simply from errors and limitations of the sequencing equipment in biological labs. Biologists have also been particularly interested in degenerate sequences for the problem of degenerate primer design for the polymerase chain reaction (PCR)[17].
4.1.1 Degenerate Primer Design Problem
PCR is a process that amplifies a specific region of DNA to provide enough copies of that region to be tested or sequenced. To use this PCR process, the biologists ...