Bioinformatics with Python Cookbook

by Tiago Antao
June 2015
Intermediate to advanced
306 pages
6h 50m
English
Packt Publishing

Analyzing data in the variant call format

After running a genotype caller (for example, GATK or samtools), you will have a variant call format (VCF) file reporting genomic variations such as single-nucleotide polymorphisms (SNPs), insertions/deletions (INDELs), copy number variations (CNVs), and so on. In this recipe, we will discuss VCF processing with the PyVCF module.
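
To make the structure of a VCF file concrete, here is a minimal sketch that counts SNPs and INDELs in a tiny in-memory example. It deliberately uses only the Python standard library rather than PyVCF, so it illustrates the tab-separated layout of the format and not the module's API. The sample records, the `classify` rule, and the function names are all assumptions made up for this illustration; a real caller's output has many more columns and annotations.

```python
import io
from collections import Counter

# Hypothetical three-record VCF, invented for illustration only.
SAMPLE_VCF = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
1\t100\t.\tA\tG\t50\tPASS\tDP=10
1\t200\t.\tAT\tA\t40\tPASS\tDP=12
2\t300\t.\tC\tCGG\t60\tPASS\tDP=8
"""


def classify(ref, alts):
    """Roughly label a record from its REF allele and list of ALT alleles.

    This is a simplified rule of thumb (an assumption of this sketch),
    not the classification used by any particular library.
    """
    # Symbolic or missing alleles such as <DEL>, "." or "*" are not handled.
    if any(a.startswith("<") or a in (".", "*") for a in alts):
        return "other"
    # Single base replaced by single base(s): a SNP.
    if len(ref) == 1 and all(len(a) == 1 for a in alts):
        return "snp"
    # Any allele with a different length from REF: an insertion or deletion.
    if any(len(a) != len(ref) for a in alts):
        return "indel"
    return "other"


def count_variant_types(handle):
    """Count variant types over the data lines of a VCF text stream."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines, meta-information (##) and the column header (#).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        ref, alt = fields[3], fields[4]  # REF is column 4, ALT is column 5
        counts[classify(ref, alt.split(","))] += 1
    return counts


counts = count_variant_types(io.StringIO(SAMPLE_VCF))
print(dict(counts))
```

A dedicated parser is preferable for real data, because it also handles the header metadata, per-sample genotype columns, and compressed input that this sketch ignores.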

Getting ready

While next-generation sequencing is all about big data, there is a limit to how much I can ask you to download as a dataset for this book. I believe that 2 to 20 GB of data for a tutorial is asking too much. While the 1000 Genomes Project VCF files with realistic annotations are of this order of magnitude, we will want to work with much less data here. Fortunately, the ...

Publisher Resources

ISBN: 9781782175117