
This is the Title of the Book, eMatter Edition
Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
Chapter 2: Biological Sequences
genetic element or selfish DNA (a phrase coined by Francis Crick). These entities are
a bit like the fleas and ticks of the genome: they copy and spread themselves within
and between genomes and are generally believed to do little for the host genome.
Selfish DNAs are usually further classified into three subcategories: transposons,
retroviruses, and retrotransposons. If you see these names in a BLAST report, you may
need to use a repeat filter.
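As a rough illustration of spotting these names, the sketch below scans a list of hit descriptions for the three subcategories just mentioned. The function name, the keyword pattern, and the example descriptions are all assumptions made for illustration. They are not part of BLAST or of any real report, and a keyword scan like this is only a hint that filtering may be needed, not a repeat filter in itself.

```python
import re

# Keyword stems for the three subcategories named in the text
# (transposons, retroviruses, retrotransposons). The pattern is an
# illustrative assumption, not an exhaustive vocabulary.
SELFISH_DNA_PATTERN = re.compile(r"transpos|retrovir", re.IGNORECASE)


def flag_selfish_dna_hits(descriptions):
    """Return the descriptions that mention a selfish-DNA keyword."""
    return [d for d in descriptions if SELFISH_DNA_PATTERN.search(d)]


# Hypothetical hit descriptions, invented for this example only.
example_hits = [
    "hypothetical protein kinase",
    "Retrotransposon gag protein",
    "putative transposase",
    "endogenous retrovirus polymerase",
]

flagged = flag_selfish_dna_hits(example_hits)
for description in flagged:
    print("possible repeat-derived hit:", description)
```

If many of the top hits are flagged this way, that is a reasonable cue to rerun the search with the repeats masked.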
Pseudogenes
One of the most confounding problems in similarity searches is the presence of
pseudogenes. As the name suggests, pseudogenes are “fake genes”; that is, they look
like they could encode a protein, but they aren’t functional. Pseudogenes come from
a variety of sources. A mutation that introduces a stop codon into a gene creates a
pseudogene, but more commonly, pseudogenes are created from some kind of duplication
event. Sometimes, through various mechanisms, regions of a chromosome
may become duplicated. The extra copies of genes are generally free of selective pressures
and may become pseudogenes as they accumulate mutations. Duplication may
also result from repetitive elements that include neighboring DNA as they copy
themselves into new locations. In eukaryotes, a very common form of pseudogene ...
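The simplest route to a pseudogene described above, a mutation that introduces a stop codon, can be sketched in a few lines. The example below is a minimal sketch that assumes the standard genetic code and a coding sequence that starts in frame. The function name and both toy sequences are hypothetical and were made up for illustration.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    # The last codon is skipped because a normal gene ends with a stop.
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i
    return None


# Toy sequences, invented for this example only.
intact = "ATGGCTCAAGGTTAA"  # ATG GCT CAA GGT TAA
mutant = "ATGGCTTAAGGTTAA"  # ATG GCT TAA GGT TAA (CAA -> TAA)

print(first_premature_stop(intact))
print(first_premature_stop(mutant))
```

A single C-to-T change turns the third codon of the toy sequence into a stop, so the mutant would be translated into a truncated product even though it still looks very much like the intact gene in a similarity search.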