Errata for Genomics in the Cloud

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version Location Description Submitted by Date submitted
ePub Chapter 2 → Intro to Genomics → The Gene as a ... → 2nd paragraph

Text reads:
"...such as Charles “Origin of Species” Darwin and Gregor “Pea Enthusiast”genes by Wilhelm Johannsen in the early 20th century."

Text appears to have been excised, starting where Mendel's last name should be and continuing up to the phrase that ends with 'genes by Wilhelm...'

loki der quaeler  Jul 13, 2021 
Printed Page 20
last paragraph

The text reads
"hundreds of millions of very short sequences for a single genome"
"about 40 million sequences of ~200 bases, which amounts a 100 GB file per genome"

40 M * 200 B = 8000 MB = 8 GB

I think 40 should be 400, which gives 80 GB, close to the stated ~100 GB.

Jongsu Kim  Oct 10, 2021 
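
The arithmetic in the entry above can be checked with a minimal shell sketch. It assumes one byte per base and ignores quality scores, read names, and compression, so it only reproduces the submitter's back-of-the-envelope estimate. `estimate_gb` is a hypothetical helper name, not something from the book.

```shell
# Rough size estimate in GB (assumption: 1 byte per base, nothing else stored).
# $1 = number of reads in millions, $2 = read length in bases.
# Millions of reads x bases per read = megabytes; divide by 1000 for gigabytes.
estimate_gb() {
  reads_millions=$1
  read_length=$2
  echo $(( reads_millions * read_length / 1000 ))
}

estimate_gb 40 200    # the figure as printed: 8
estimate_gb 400 200   # the figure the submitter proposes: 80
```

Under these assumptions only the 400 million figure lands near the 100 GB the text quotes.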
Page 169 (approximate)
Working through the Safari Books Online material, so paragraphs do not correspond to printed pages.

The command below (at the end of this comment) will not run: file paths are not specified and some files appear to be missing. For example, I changed -R reference.fasta to ref/ref.fasta based on what is in the Docker container, but other files, such as hapmap_sites.vcf.gz, were missing. What should the correct command be, and how or where can I get the missing files? Thanks.
gatk VariantRecalibrator \
-R reference.fasta \
-V jointcalls_hc.vcf.gz \
--resource:hapmap,known=false,training=true,truth=true,prior=15.0 \
hapmap_sites.vcf.gz \
--resource:omni,known=false,training=true,truth=false,prior=12.0 \
1000G_omni2.5.sites.vcf.gz \
--resource:1000G,known=false,training=true,truth=false,prior=10.0 \
1000G_phase1.snps.high_conf.vcf.gz \
--resource:dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp.vcf.gz \
-an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR \
-mode SNP \
-O output.recal \
--tranches-file output.tranches

Pam  Dec 22, 2021 
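
One way to narrow down a failure like the one reported above is to check which input files are actually present before invoking the tool. The following is a minimal sketch, not the book's method: `check_inputs` is a hypothetical helper, and the file names are copied from the command as quoted, so they would need adjusting to wherever the files live in a given environment. It does not say where the missing resources can be obtained.

```shell
# Hypothetical pre-flight check: report every listed path that does not exist.
# After the call, $missing holds the number of paths that were not found.
check_inputs() {
  missing=0
  for f in "$@"; do
    if [ ! -e "$f" ]; then
      echo "missing: $f" >&2
      missing=$((missing + 1))
    fi
  done
}

# File names as they appear in the quoted command (paths are assumptions).
check_inputs \
  reference.fasta \
  jointcalls_hc.vcf.gz \
  hapmap_sites.vcf.gz \
  1000G_omni2.5.sites.vcf.gz \
  1000G_phase1.snps.high_conf.vcf.gz \
  dbsnp.vcf.gz

echo "$missing input file(s) not found"
```

Running this from the working directory used for the gatk command lists the files that need to be located or supplied with full paths.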
ePub Page 236
Introducing Scatter-Gather Parallelism

There is an error in the WDL file `scatter-haplotypecaller.wdl` in the section Introducing Scatter-Gather Parallelism. The file as shown does not work because the output declared as `merged_vcf = "${merged_vcf_name}"` is named `merged_vcf` instead of `mergedGVCF`. Renaming it to `mergedGVCF` makes the WDL work; otherwise the file fails when validated with womtool.




SAAD MURTAZA Khan  Dec 15, 2021