Skip to Content
Bioinformatics with Python Cookbook - Second Edition
book

Bioinformatics with Python Cookbook - Second Edition

by Tiago Antao
November 2018
Intermediate to advanced
360 pages
9h 36m
English
Packt Publishing
Content preview from Bioinformatics with Python Cookbook - Second Edition

Getting ready

While NGS is all about big data, there is a limit to how much I can ask you to download as a dataset for this book. I believe that 2 to 20 GB of data for a tutorial is asking too much. While the 1,000 Genomes' VCF files with realistic annotations are in this order of magnitude, we will want to work with much less data here. Fortunately, the Bioinformatics community has developed tools to allow for the partial download of data. As part of the SAMtools/htslib package (http://www.htslib.org/), you can download tabix and bgzip, which will take care of data management. On the command line, perform the following:

tabix -fh ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/supporting/vcf_with_sample_level_annotation/ALL.chr22.phase3_shapeit2_mvncall_integrated_v5_extra_anno.20130502.genotypes.vcf.gz ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Bioinformatics with Python Cookbook

Bioinformatics with Python Cookbook

Tiago Antao

Publisher Resources

ISBN: 9781789344691Supplemental Content