CHAPTER 27

MAXIMUM ENTROPY METHOD FOR COMPOSITION VECTOR METHOD

Raymond H.-F. Chan, Roger W. Wang, and Jeff C.-F. Wong

27.1 INTRODUCTION

In the past few decades, a large volume of molecular sequences has been collected, from which the evolution and traits of the related living organisms are investigated. These sequences all look simple; for instance, a DNA sequence, no matter how long it is, contains only four different nucleotides (A, C, G, and T), so it is not surprising that, on the surface, the sequences themselves cannot tell us much. To reveal the hidden information, sequence comparison is an essential tool. Sequence comparison methods can be divided into two main categories: alignment-based [15, 17, 36, 37, 42, 50] and alignment-free [25, 28, 40, 43, 52].

The alignment-based methods use the dynamic programming (DP) method to "align" the sequences and then measure their similarity and dissimilarity after the alignment. To compare two sequences of length n by any alignment-based method, both the computational cost and the memory requirement are O(n^2) [54]. Because of the accuracy of the DP method, the alignment-based methods are widely used for analyzing gene sequences. However, different gene sequences may give different evolutionary results. For instance, based on the 16rRNA sequences, birds, which are more closely related to crocodilians, were grouped with ...
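
As a rough illustration of where the cost of DP alignment comes from, the sketch below computes a global alignment score by filling a table with one cell per pair of sequence positions. It is a minimal textbook-style example and not the method of any cited reference; the function name global_alignment_score and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen only for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of sequences a and b.

    Illustrative sketch only: a linear gap penalty and hypothetical
    scoring values are assumed.
    """
    n, m = len(a), len(b)
    # The (n + 1) x (m + 1) table is the source of the quadratic memory use.
    score = [[0] * (m + 1) for _ in range(n + 1)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Every cell is visited exactly once: quadratic time for equal lengths.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap inserted in b
                score[i][j - 1] + gap,       # gap inserted in a
            )
    return score[n][m]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

For two sequences of length n, the table has (n + 1)^2 cells and each is filled in constant time, so both the running time and the memory grow quadratically with n.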

