Tableau Prep即学即用

by Carl Allchin
August 2022
Beginner to intermediate
463 pages
9h 22m
Chinese
China Electric Power Press Ltd.
Chapter 48: Using History Tables
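If you have 10 million customers and keep monthly snapshots for the past year, you suddenly find yourself analyzing 120 million data records. A dataset of that size is likely to slow down your analysis when you run calculations or render visualizations. One option, when building the analysis flow, is to aggregate the 110 million rows in the history table to a coarser level of granularity, such as a frequency lower than monthly. Another option is to stop analyzing the dataset customer by customer and instead form groups of customers with similar demographics or product holdings.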
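The two reduction options above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's method (the book builds these steps as Tableau Prep flows): the tuple layout, the sample rows, and the helper names `quarter_of`, `to_quarterly`, and `to_segments` are all hypothetical, and the 110 million figure is reproduced by assuming the most recent monthly snapshot is the live table.

```python
from collections import defaultdict

# Row-count arithmetic from the text: 10 million customers, 12 monthly snapshots.
customers = 10_000_000
snapshots_per_year = 12
total_rows = customers * snapshots_per_year            # 120 million
# Assumption: the latest snapshot is the live table, the rest is history.
history_rows = customers * (snapshots_per_year - 1)    # 110 million

# Hypothetical history rows: (customer_id, month "YYYY-MM", segment, balance)
rows = [
    ("C1", "2021-01", "retail", 100.0),
    ("C1", "2021-02", "retail", 120.0),
    ("C1", "2021-03", "retail", 110.0),
    ("C2", "2021-01", "business", 900.0),
    ("C2", "2021-02", "business", 950.0),
    ("C2", "2021-04", "business", 1000.0),
    ("C3", "2021-01", "retail", 50.0),
]

def quarter_of(month):
    """Map 'YYYY-MM' to 'YYYY-Qn'."""
    year, mm = month.split("-")
    return f"{year}-Q{(int(mm) - 1) // 3 + 1}"

def to_quarterly(history):
    """Option 1: keep only each customer's latest snapshot per quarter."""
    latest = {}
    for row in history:
        key = (row[0], quarter_of(row[1]))
        # 'YYYY-MM' strings sort chronologically, so string comparison is enough.
        if key not in latest or row[1] > latest[key][1]:
            latest[key] = row
    return sorted(latest.values())

def to_segments(history):
    """Option 2: drop the customer level and total balances per segment and month."""
    totals = defaultdict(float)
    for _customer_id, month, segment, balance in history:
        totals[(segment, month)] += balance
    return dict(totals)

quarterly = to_quarterly(rows)
by_segment = to_segments(rows)
```

Which reduction is appropriate depends on the questions the analysis has to answer: the quarterly version keeps customer-level detail but loses month-to-month movement, while the segment version keeps every month but can no longer answer questions about an individual customer.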
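The benefit of maintaining a separate history table is that you can process it differently and then join only the relevant data. Compared with removing data from the history table and then joining it to the live data, removing data from the live dataset for analysis greatly increases the risk of mistakes that would affect your analysis.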
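As a rough sketch of that idea, the following Python trims a copy of the history rows and then combines them with the live rows, leaving the live data untouched. The field names (`customer_id`, `as_of`, `products`), the sample rows, and the functions `relevant_history` and `combine` are hypothetical, and the filter shown (recent rows for customers who still appear in the live data) is just one possible definition of "relevant".

```python
from datetime import date

# Hypothetical live table: one current row per customer.
live = [
    {"customer_id": "C1", "as_of": date(2021, 12, 31), "products": 3},
    {"customer_id": "C2", "as_of": date(2021, 12, 31), "products": 1},
]

# Hypothetical history table, maintained separately from the live data.
history = [
    {"customer_id": "C1", "as_of": date(2021, 10, 31), "products": 2},
    {"customer_id": "C1", "as_of": date(2021, 11, 30), "products": 3},
    {"customer_id": "C2", "as_of": date(2021, 11, 30), "products": 1},
    {"customer_id": "C3", "as_of": date(2021, 11, 30), "products": 4},
]

def relevant_history(history_rows, live_rows, since):
    """Filter the history only: rows on or after `since` for customers still live."""
    live_ids = {row["customer_id"] for row in live_rows}
    return [
        row for row in history_rows
        if row["customer_id"] in live_ids and row["as_of"] >= since
    ]

def combine(history_rows, live_rows, since):
    """Return a new list; neither input list is modified."""
    return relevant_history(history_rows, live_rows, since) + list(live_rows)

combined = combine(history, live, since=date(2021, 11, 1))
```

All of the trimming happens on the history side, so a mistake in the filter affects only the historical context and never removes rows from the live dataset.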
48.4 Data Regulations
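The main consideration with history tables is respecting the laws that govern data use. When customers or employees leave, they have the right to request that their data be deleted. However, a record showing that an anonymous customer held certain products or made certain transactions is still useful. As long as the information is not personally identifiable, you can keep such records; otherwise, they should be deleted (some industries have different rules on this). Retaining this information becomes safer when the data is aggregated to a level above individual transactions.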
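The combination of removing identifying fields and aggregating above the transaction level might look like the sketch below. Everything here is an assumption for illustration: the field names, the sample rows, the contents of `PII_FIELDS`, and the helpers `strip_pii` and `aggregate`. Which fields count as personally identifiable, and whether dropping them is sufficient, is a legal question that code like this does not settle.

```python
from collections import Counter

# Assumption: these hypothetical fields are the ones that identify a person.
PII_FIELDS = {"customer_id", "name", "email"}

# Hypothetical transaction-level history rows.
transactions = [
    {"customer_id": "C1", "name": "A. Example", "email": "a@example.com",
     "product": "loan", "month": "2021-01"},
    {"customer_id": "C2", "name": "B. Example", "email": "b@example.com",
     "product": "loan", "month": "2021-01"},
    {"customer_id": "C1", "name": "A. Example", "email": "a@example.com",
     "product": "card", "month": "2021-02"},
]

def strip_pii(row, pii_fields=PII_FIELDS):
    """Return a copy of the row without the identifying fields."""
    return {key: value for key, value in row.items() if key not in pii_fields}

def aggregate(rows):
    """Count transactions per (product, month), above the single-transaction level."""
    return Counter((row["product"], row["month"]) for row in rows)

anonymous_rows = [strip_pii(row) for row in transactions]
summary = aggregate(anonymous_rows)
```

The aggregated `summary` still records that certain products were held in certain months, but no row in it corresponds to a single person or a single transaction.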
48.5 History Table Example
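There are many things to consider when writing a history table.
First, how is the data updated? Let's use an example that is common to many companies: recording complaints. Figure 48-1 shows how complaint records are typically stored in a table in a typical operational system. The Extract Date field is used in many internal reports to show when the data was taken from the system.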
Figure 48-1: Sample complaints dataset
This looks like a simple, straightforward dataset. The complexity lies in how the table is updated. Figure ...

ISBN: 9787519864439