Chapter 3. Reverse Complement of DNA: String Manipulation
The Rosalind REVC challenge explains that the bases of DNA form pairs of A-T and G-C. Additionally, DNA has directionality and is usually read from the 5'-end (five-prime end) toward the 3'-end (three-prime end). As shown in Figure 3-1, the complement of the DNA string AAAACCCGGT is TTTTGGGCCA. I then reverse this string (reading from the 3'-end) to get ACCGGGTTTT as the reverse complement.
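The complement-then-reverse procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the chapter's own solution: the names `COMPLEMENT` and `reverse_complement` are hypothetical, and the sketch assumes uppercase input containing only A, C, G, and T.

```python
# Base-pairing rules from the text: A pairs with T, and G pairs with C.
# COMPLEMENT is a hypothetical name chosen for this illustration.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Complement each base, then reverse the result (illustrative helper)."""
    complement = "".join(COMPLEMENT[base] for base in dna)
    return complement[::-1]  # a slice with step -1 reverses the string


dna = "AAAACCCGGT"
complement = "".join(COMPLEMENT[base] for base in dna)
print(complement)               # TTTTGGGCCA
print(reverse_complement(dna))  # ACCGGGTTTT
```

A base outside the four listed raises a `KeyError` here; how to handle lowercase or ambiguous bases is a design choice this sketch leaves open.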
Although you can find many existing tools to generate the reverse complement of DNA (and I'll drop a spoiler alert that the final solution will use a function from the Biopython library), the point of writing our own algorithm is to explore Python. In this chapter, you will learn:
- How to implement a decision tree using a dictionary as a lookup table
- How to dynamically generate a list or a string
- How to use the reversed() function, which is an example of an iterator
- How Python treats strings and lists similarly
- How to use a list comprehension to generate a list
- How to use str.maketrans() and str.translate() to transform a string
- How to use Biopython's Bio.Seq module
- That the real treasure is the friends you make along the way
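As a rough preview of a few items on this list, the sketch below reaches the same reverse complement two ways using only the standard library. The variable names are illustrative and are not taken from the program in the 03_revc directory; the input is assumed to be uppercase A, C, G, and T only.

```python
dna = "AAAACCCGGT"

# reversed() returns a lazy iterator rather than a string or a list,
# so it has to be consumed (here by str.join) to get a string back.
reversed_dna = "".join(reversed(dna))

# Approach 1: a dictionary as a lookup table plus a list comprehension.
pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
complement_bases = [pairs[base] for base in reversed_dna]  # list of single characters
revc_lookup = "".join(complement_bases)

# Approach 2: str.maketrans() builds a translation table and
# str.translate() applies it to every character in one call.
table = str.maketrans("ACGT", "TGCA")
revc_translate = "".join(reversed(dna.translate(table)))

print(revc_lookup)     # ACCGGGTTTT
print(revc_translate)  # ACCGGGTTTT
```

Both approaches give the same answer whether the string is reversed before or after complementing, because the complement is applied to each base independently of its position.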
Getting Started
The code and tests for this program are in the 03_revc directory. To get a feel ...