23 Numerical Sequence Alignment

The DNA and protein alignment algorithms discussed thus far represent biological information as strings over a specific alphabet and rely on string manipulation to align the sequences. An alternative approach is to represent the biological information as a vector of numerical values, which makes a variety of numerical comparison algorithms available for comparing and analyzing the data. This chapter discusses some methods of representing the information as numerical vectors, along with basic comparison algorithms.
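As a rough illustration of the idea, the sketch below turns a DNA string into a numerical vector and compares two such vectors with a dot product. The one-hot encoding, the alphabet order, and the helper names `one_hot` and `dot` are assumptions chosen for this example and are not necessarily the encodings developed in this chapter.

```python
# Hypothetical one-hot encoding; the alphabet order "ACGT" is an arbitrary choice.
ALPHABET = "ACGT"


def one_hot(seq):
    """Return a flat list holding four 0/1 entries for each base in seq."""
    vec = []
    for base in seq.upper():
        # Exactly one entry per base is 1.0 (all 0.0 for an unknown symbol).
        vec.extend(1.0 if base == letter else 0.0 for letter in ALPHABET)
    return vec


def dot(u, v):
    """Dot product of two equal-length numerical vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have equal length")
    return sum(a * b for a, b in zip(u, v))


a = one_hot("ACGTAC")
b = one_hot("ACGTTC")

# With one-hot vectors, the dot product counts the positions where bases agree.
matches = dot(a, b)
print(matches)
```

With this particular encoding the dot product of two equal-length sequences is simply the number of matching positions, which shows how a string comparison can be recast as arithmetic on vectors.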

23.1 Alternative Encodings

The focus here is on several numerical encodings of sequence data. Many of these offer advantages ...
