CHAPTER 32
COMPARATIVE GENOMICS: ALGORITHMS AND APPLICATIONS
32.1 INTRODUCTION
“Comparative genomics” is a constantly evolving term that refers to intra- or interspecies comparisons. It includes, but is not restricted to, the research areas of ortholog assignment, synteny detection, gene cluster detection, multiple genome alignment, rearrangement analysis, ancestral genome reconstruction, and gene or speciation tree construction. We consider the first three to be “upstream” problems and the rest to be “downstream” problems, because the solutions of the former are typically used as inputs to the latter. However, new algorithmic approaches that consider multiple problems concurrently are making this distinction less clear-cut.
The long-term goal of comparative genomics is to understand evolution. At the microlevel, researchers want to understand how nucleotide sequences evolve (e.g., the rate of point mutations); at the macrolevel, they want to understand how genes, regulatory elements, networks, and metabolic pathways evolve (e.g., gene insertions, fusions, and functional changes in a pathway resulting from gene losses). More specifically, thanks to evolution, comparing multiple genomes better reveals the functional elements in a genome [36, 18], enabling the discovery of genes and other functional units such as transcription factor binding sites and siRNAs. Because functional elements tend to be subjected to negative selection, conserved elements among ...