How to do it...

Let's take a look at the following steps:

  1. Let's start by listing the chromosomes of the A. gambiae genome:
    import gzip
    from Bio import SeqIO

    gambiae_name = 'gambiae.fa.gz'
    atroparvus_name = 'atroparvus.fa.gz'
    recs = SeqIO.parse(gzip.open(gambiae_name, 'rt', encoding='utf-8'), 'fasta')
    for rec in recs:
        print(rec.description)

This will produce the following output:

    chromosome:AgamP3:2L:1:49364325:1 chromosome 2L
    chromosome:AgamP3:2R:1:61545105:1 chromosome 2R
    chromosome:AgamP3:3L:1:41963435:1 chromosome 3L
    chromosome:AgamP3:3R:1:53200684:1 chromosome 3R
    chromosome:AgamP3:UNKN:1:42389979:1 chromosome UNKN
    chromosome:AgamP3:X:1:24393108:1 chromosome X
    chromosome:AgamP3:Y_unplaced:1:237045:1 chromosome Y_unplaced

The code is quite straightforward. ...
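
Each description in the listing packs several fields into one colon-delimited token. As a minimal sketch, assuming the fields follow the order type, assembly, name, start, end, strand (an interpretation of the output above, not something the recipe states), the chromosome name and length could be pulled out as follows. The `ChromosomeInfo` type and the `parse_description` helper are hypothetical names introduced only for this illustration.

```python
from typing import NamedTuple


class ChromosomeInfo(NamedTuple):
    """Fields recovered from one FASTA description line (hypothetical type)."""
    assembly: str
    name: str
    start: int
    end: int


def parse_description(description: str) -> ChromosomeInfo:
    """Split a description such as
    'chromosome:AgamP3:2L:1:49364325:1 chromosome 2L' into its parts.

    Assumes the field order type:assembly:name:start:end:strand.
    """
    # The first whitespace-separated token holds the colon-delimited fields.
    location = description.split()[0]
    _type, assembly, name, start, end, _strand = location.split(':')
    return ChromosomeInfo(assembly, name, int(start), int(end))


# Two descriptions copied from the listing above.
descriptions = [
    'chromosome:AgamP3:2L:1:49364325:1 chromosome 2L',
    'chromosome:AgamP3:Y_unplaced:1:237045:1 chromosome Y_unplaced',
]
for description in descriptions:
    info = parse_description(description)
    # Coordinates are treated as 1-based and inclusive (an assumption).
    print(info.name, info.end - info.start + 1)
```

In the recipe's loop, the same helper could be applied to `rec.description` for each record, which avoids loading sequence lengths separately when only the header metadata is needed.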
