9 Tandem Repeats

Some regions of the genome contain repeated stretches of DNA. In some cases the repeats are quite simple, for example GATGATGAT. In other cases they are much more complicated: a region may repeat with minor variations, or repeats may be nested inside other repeats. This chapter explores one method of finding these repeating regions.

9.1 Tandem Repeats

Tandem repeats are consecutive repeating segments in a string. Two simple examples are

TCTCTCTC

ATTCATTCATTC
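To make the idea of a consecutive repeating segment concrete, here is a minimal sketch that finds the shortest unit whose copies tile a string exactly, as in GATGATGAT. The function name `repeat_unit` is a hypothetical helper introduced for illustration; it is a brute-force check and is not necessarily the method this chapter develops.

```python
def repeat_unit(segment):
    """Return (unit, count) for the shortest unit that tiles `segment` exactly.

    A string that is not a perfect repeat is returned whole with count 1.
    """
    n = len(segment)
    for size in range(1, n + 1):
        # A unit can only tile the string if its length divides the total length.
        if n % size == 0 and segment[:size] * (n // size) == segment:
            return segment[:size], n // size
    return segment, 0  # only reached for the empty string


print(repeat_unit("TCTCTCTC"))      # ('TC', 4)
print(repeat_unit("ATTCATTCATTC"))  # ('ATTC', 3)
print(repeat_unit("GATTACA"))       # ('GATTACA', 1)
```

The sketch only handles a string that is a perfect repeat from end to end; it does not locate repeats embedded in a longer sequence or tolerate variation between copies.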

A compressed notation for a repeat encloses the repeated substring in parentheses and gives the number of copies as a subscript. For example, TGTGTGTG can be written as (TG)₄.
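The compressed notation can be produced mechanically. The sketch below uses a regular-expression backreference to find exact tandem repeats inside a longer sequence and rewrite each one in the parenthesized form. The function `compress_repeats` and its parameters `max_unit` and `min_copies` are assumptions made for this illustration, not names from the chapter, and the count is written inline because plain text has no subscript.

```python
import re


def compress_repeats(seq, max_unit=6, min_copies=3):
    """Rewrite exact tandem repeats in `seq` as (unit)count.

    max_unit   -- longest repeat unit to look for (assumed default)
    min_copies -- fewest consecutive copies that count as a repeat (assumed default)
    """
    # Group 1 is the candidate unit, tried shortest first (lazy quantifier);
    # \1{k,} then requires at least k further copies directly after it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))

    def to_notation(match):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        return "(%s)%d" % (unit, copies)

    return pattern.sub(to_notation, seq)


print(compress_repeats("TGTGTGTG"))       # (TG)4
print(compress_repeats("AAGATGATGATCC"))  # AA(GAT)3CC
```

Only exact copies are matched here. Repeats whose copies differ slightly would need an approximate-matching approach instead.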

Tandem repeats may become much more complex when a region repeats with a minor variation. In an ...
