Spark高级数据分析(第2版)

by Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills
June 2018
Content level: Beginner to intermediate
246 pages
6h 57m
Chinese
Posts & Telecom Press
Analyzing Genomics Data and the BDG Project
    .filter(!_._3.contains("N"))
    .map(tup => {
      val region = tup._1._1
      val label = tup._1._2
      val contig = region.referenceName
      val start = region.start
      val end = region.end
      val phylopAvg = tup._2._1
      val phylopMin = tup._2._2
      val phylopMax = tup._2._3
      val seq = tup._3
      val pwmScore = scorePWM(seq)
      val closestTss = math.min(
        distanceToClosest(bTssData.value(contig), start),
        distanceToClosest(bTssData.value(contig), end))
      val tf = "CTCF"
      (contig, start, end, pwmScore, phylopAvg, phylopMin, phylopMax,
        closestTss, tf, cellLine, label)}))
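The inner join ensures that we get well-defined feature vectors.
Extract the genomic sequence that corresponds to this site in the genome and attach it to the tuple.
Drop any site whose genomic sequence is ambiguous.
Here we build the final feature vector. ...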
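The last two steps (dropping ambiguous sites and computing the distance to the closest TSS) can be tried on plain Scala collections without Spark. The following is only a minimal sketch under stated assumptions: `FeatureSketch`, the simplified `Site` tuple, and this linear-scan `distanceToClosest` are hypothetical stand-ins, since the excerpt calls `distanceToClosest` but does not show its definition, and the real pipeline reads the TSS positions from the broadcast variable `bTssData`.

```scala
// Hypothetical stand-ins for helpers that the excerpt calls but does not define.
object FeatureSketch {
  // Assumed behavior: absolute distance from `pos` to the nearest TSS position.
  // A linear scan is used for clarity; it throws if `tss` is empty.
  def distanceToClosest(tss: Seq[Long], pos: Long): Long =
    tss.map(t => math.abs(t - pos)).min

  // (contig, start, end, sequence): a simplified site record for illustration.
  type Site = (String, Long, Long, String)

  // Returns (contig, start, end, closestTss) for every unambiguous site.
  def features(
      sites: Seq[Site],
      tssByContig: Map[String, Seq[Long]]): Seq[(String, Long, Long, Long)] =
    sites
      .filter(site => !site._4.contains("N")) // drop sites with ambiguous bases
      .map { case (contig, start, end, _) =>
        // Use whichever edge of the region lies closer to a TSS.
        val closestTss = math.min(
          distanceToClosest(tssByContig(contig), start),
          distanceToClosest(tssByContig(contig), end))
        (contig, start, end, closestTss)
      }

  def main(args: Array[String]): Unit = {
    val tss = Map("chr1" -> Seq(100L, 500L))
    val sites = Seq(
      ("chr1", 120L, 140L, "ACGT"),
      ("chr1", 480L, 490L, "ACNT")) // contains "N", so it is filtered out
    features(sites, tss).foreach(println) // prints (chr1,120,140,20)
  }
}
```

The second site is removed because its sequence contains an `N`; for the first, the start coordinate (120) is 20 bases from the TSS at 100, which is closer than the end coordinate (140), so 20 is kept.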
Publisher Resources

ISBN: 9787115482525