Bioinformatics with Python Cookbook

by Tiago Antao
June 2015
Intermediate to advanced
306 pages
6h 50m
English
Packt Publishing

Traversing genome annotations

A genome sequence is only a starting point; we usually want to extract features from it, such as genes, exons, and coding sequences. This type of annotation information is made available in GFF and GTF files. GFF stands for Generic Feature Format, and GTF for Gene Transfer Format. In this recipe, we will see how to parse and analyze GFF files, using the annotation of the Anopheles gambiae genome as an example.
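To make the file format concrete, here is a minimal sketch of reading a single GFF3 feature line with the standard library only. It assumes the usual nine tab-separated columns (sequence ID, source, type, start, end, score, strand, phase, attributes). The `GffRecord` class, the `parse_gff3_line` helper, and the example line are hypothetical and are not taken from the recipe or from the real annotation file.

```python
from dataclasses import dataclass
from urllib.parse import unquote


@dataclass
class GffRecord:
    """One feature line of a GFF3 file (hypothetical helper type)."""
    seqid: str
    source: str
    type: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    score: str
    strand: str
    phase: str
    attributes: dict


def parse_gff3_line(line):
    """Parse one GFF3 line; return None for blank lines and '#' comments."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    seqid, source, ftype, start, end, score, strand, phase, raw_attrs = fields
    # Column 9 holds semicolon-separated key=value pairs, URL-escaped.
    attributes = {}
    for pair in raw_attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[unquote(key)] = unquote(value)
    return GffRecord(seqid, source, ftype, int(start), int(end),
                     score, strand, phase, attributes)


# Made-up example line, for illustration only.
example = "2L\tExampleSource\tgene\t1000\t2500\t.\t+\t.\tID=gene0001;Name=hypothetical"
record = parse_gff3_line(example)
print(record.type, record.start, record.end, record.attributes["ID"])
```

Because GFF coordinates are 1-based and inclusive, the length of this example feature is `record.end - record.start + 1`. For real annotation files, a dedicated library is a better choice than a hand-written parser.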

Getting ready

We will use the gffutils library to process the annotation file.
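As a rough sketch of what working with gffutils might look like, the two helpers below build a database from an annotation file and count how many features of each type it contains. The function names `build_annotation_db` and `count_feature_types` are hypothetical; the gffutils calls used are `create_db`, `featuretypes`, and `count_features_of_type`. The in-memory database is an assumption made here for brevity; a file path can be passed instead so that the database is reused between sessions.

```python
def build_annotation_db(gff_path, db_path=":memory:"):
    """Create a gffutils database from a GFF file (gzipped files are accepted)."""
    # Imported here so this module still loads where gffutils is not installed.
    import gffutils
    # For a whole-genome annotation this step may take a while.
    return gffutils.create_db(gff_path, db_path)


def count_feature_types(db):
    """Return a dict mapping each feature type (gene, exon, ...) to its count."""
    return {
        feature_type: db.count_features_of_type(feature_type)
        for feature_type in db.featuretypes()
    }
```

With gffutils installed and the annotation file in place, `count_feature_types(build_annotation_db('gambiae.gff.gz'))` would give an overview of which feature types the file contains and how many of each.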

If you do not use the notebook, you need to download the annotation file from our datasets page at https://github.com/tiagoantao/bioinf-python/blob/master/notebooks/Datasets.ipynb (file gambiae.gff3.gz). Rename the annotation file to gambiae.gff.gz. Preferably, use the 02_Genomes/Annotations.ipynb ...


Publisher Resources

ISBN: 9781782175117