云端基因组学 (Genomics in the Cloud)

by Geraldine A. Van der Auwera, Brian D. O’Connor
April 2022
486 pages
Chinese
China Electric Power Press Ltd.
Pipelines API: Running Multiple Workflows
10.3.3 Cost Optimization Tips
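If you are curious about how the Broad Institute team reduced the cost of its workflows and want to explore how that was implemented, there is no better place to look than Chapter 9. The optimized pipeline we have been discussing is the second workflow we dissected in the previous chapter (the one with subworkflows and task libraries).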
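Rather than getting bogged down in the details, we summarize below the three most effective strategies for optimizing WDL workflows. Used appropriately, they can lower your running costs on Google Cloud. For the actual code, see the task library we studied in Chapter 9, GermlineVariantDiscovery.wdl.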
Dynamically Sizing Resource Allocations
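For a given task, the more VM storage you request, the more it costs. Requesting only the minimum disk space the task needs keeps costs low, but how do you cope with input files of varying sizes without manually checking every sample you plan to process? Fortunately, some WDL functions let you compute the size of a task's input files at runtime (but before the VM is requested). You can then work out how much disk to allocate, based on reasonable assumptions about how large the task's output will be. For example, the following code first calculates the total size of the reference genome files. It then takes a fraction of the size of the input BAM file (more on that shortly), adds the size of the reference files, and adds some padding reserved for the output to arrive at the disk size that is likely to be needed: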
Float ref_size = size(ref_fasta, "GB") + size(ref_fasta_index, "GB") + size(ref_dict, "GB")
Int disk_size = ceil(((size(input_bam, "GB") + 30) / hc_scatter) + ref_size) ...
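To make the arithmetic concrete, here is a minimal Python sketch of the part of the `disk_size` expression that is visible above. It is an illustration, not the workflow's own code. The function name `estimate_disk_gb` and all file sizes are hypothetical, and `hc_scatter` is treated simply as the divisor that appears in the expression. Whatever follows the closing parenthesis in the excerpt is elided there and is not reproduced here.

```python
import math


def estimate_disk_gb(bam_gb, ref_fasta_gb, ref_index_gb, ref_dict_gb, hc_scatter):
    """Mirror the visible part of the WDL disk_size expression (sizes in GB)."""
    # Total size of the reference files, as in the WDL ref_size declaration.
    ref_size = ref_fasta_gb + ref_index_gb + ref_dict_gb
    # BAM size plus 30 GB, divided by hc_scatter, plus the reference files.
    # WDL's ceil() rounds a Float up to an Int, like math.ceil here.
    return math.ceil(((bam_gb + 30) / hc_scatter) + ref_size)


# Hypothetical inputs: two BAM files of different sizes, same reference files.
small = estimate_disk_gb(20.0, 3.0, 0.1, 0.1, hc_scatter=50)
large = estimate_disk_gb(70.0, 3.0, 0.1, 0.1, hc_scatter=50)
print(small, large)  # prints "5 6"
```

With these made-up numbers, the larger BAM file yields a larger disk request without anyone inspecting the sample by hand, which is the point of computing the size at runtime.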
ISBN: 9787519864422