
This is the Title of the Book, eMatter Edition
Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
Variations
|
51
Variations
The algorithms as described and implemented earlier are rarely used. Over the years,
important innovations have made the general algorithms more applicable to aligning
biological sequences and running efficiently in a computer.
Gap Modifications
Most alignment algorithms employ affine gaps. The previous description used a sim-
ple gap-scoring scheme in which every letter inserted or deleted cost the same
amount. This isn’t a very good model of biology because large gaps tend to occur
fairly often. To better model this phenomenon, affine gaps are used where there is a
greater penalty for opening a gap than extending the gap. Figure 3-7 shows a graphi-
cal view.
Affine gap penalties are a simple modification to either algorithm. All you need to do
is to record a third value in each cell of the matrix that keeps track of whether a gap
has already been opened or not and then assign the appropriate gap penalty. Some
algorithms even allow multiple gap scoring schemes so that very long gaps are not
penalized as much. You can visualize how this works by following the minimal pen-
alty in Figure 3-7. These algorithms are slightly more complicated because scores for
each affine gap must be tracked. Usually two affine penalties are employed, and the
algorithms are labeled ...