机器学习流水线实战 (Building Machine Learning Pipelines)

by Hannes Hapke, Catherine Nelson
November 2021
Intermediate to advanced
302 pages
8h 57m
Chinese
Posts & Telecom Press
Data Privacy for Machine Learning
[Figure 14-1: Randomized response flowchart. Recoverable labels: question, truth, random choice.]
These randomized transformations are the key to differential privacy (DP).
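The randomized response flow in Figure 14-1 can be sketched in a few lines of Python. This is a minimal illustration, not the book's code: it assumes a fair-coin protocol (heads: answer truthfully; tails: answer yes or no at random), and the function names `randomized_response` and `estimate_true_rate` are hypothetical.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Answer a yes/no question with plausible deniability.

    Assumed protocol: flip a fair coin. On heads, answer truthfully.
    On tails, flip again and report "yes" on heads, "no" on tails.
    """
    if rng.random() < 0.5:       # first flip: heads, tell the truth
        return truth
    return rng.random() < 0.5    # first flip: tails, answer at random


def estimate_true_rate(responses: list[bool]) -> float:
    """Recover the population-level "yes" rate from the noisy answers.

    Under the assumed protocol, P(reported yes) = 0.5 * p + 0.25,
    so the true rate is p = 2 * P(reported yes) - 0.5.
    """
    observed = sum(responses) / len(responses)
    return 2 * observed - 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated population in which roughly 30% would truthfully say "yes".
    true_answers = [rng.random() < 0.3 for _ in range(100_000)]
    reported = [randomized_response(t, rng) for t in true_answers]
    print(f"true rate:      {sum(true_answers) / len(true_answers):.3f}")
    print(f"estimated rate: {estimate_true_rate(reported):.3f}")
```

No single reported answer reveals what that person would truthfully have said, yet the aggregate rate can still be estimated once enough answers are collected.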
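Assume one training example per person
For simplicity, this chapter assumes that each training example in the dataset is associated with, or was collected from, one individual person.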
14.2.1 Local and global differential privacy
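DP can be divided into two main approaches: local DP and global DP. In local DP, noise or randomness is added at the level of the individual, as in the randomized response example earlier, so privacy is maintained between the individual and the data collector. In global DP, noise is added to a transformation over the entire dataset. The data collector is trusted with the raw data, but the result of the transformation does not reveal data about any individual.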
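Compared with local DP, global DP needs less noise for a similar privacy guarantee, which improves the utility or accuracy of a query. The downside is that global DP requires trusting the data collector, whereas with local DP only the individual users ever see their own raw data.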
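The difference in noise can be made concrete with a rough simulation. The sketch below is an illustration under stated assumptions, not the book's method: values are assumed to lie in [0, 1], both variants use the Laplace mechanism, neighboring datasets are assumed to differ by replacing one person's value, and all function names are hypothetical.

```python
import random
import statistics


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def global_dp_mean(values, epsilon, rng):
    """A trusted collector sees the raw values and adds noise once.

    Replacing one value in [0, 1] moves the mean of n values by at
    most 1/n, so the Laplace scale is 1 / (n * epsilon).
    """
    n = len(values)
    return statistics.fmean(values) + laplace_noise(1 / (n * epsilon), rng)


def local_dp_mean(values, epsilon, rng):
    """Each person perturbs their own value before sending it.

    A single value in [0, 1] can change by at most 1, so every report
    gets Laplace noise with scale 1 / epsilon.
    """
    reports = [v + laplace_noise(1 / epsilon, rng) for v in values]
    return statistics.fmean(reports)


def rmse_over_trials(mechanism, values, epsilon, trials, rng):
    """Root-mean-square error of a noisy mean across repeated runs."""
    true_mean = statistics.fmean(values)
    sq_errors = [
        (mechanism(values, epsilon, rng) - true_mean) ** 2
        for _ in range(trials)
    ]
    return statistics.fmean(sq_errors) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.random() for _ in range(1000)]
    for name, mech in [("global", global_dp_mean), ("local", local_dp_mean)]:
        print(name, rmse_over_trials(mech, data, 1.0, 200, rng))
```

With the same epsilon, the error of the local variant shrinks only with the square root of the number of people, while the error of the global variant shrinks with the number of people itself, which is why the global estimate is typically much closer to the true mean.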
14.2.2 Epsilon, delta, and the privacy budget
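Probably the most common way of implementing DP is ε-δ (epsilon-delta) DP. Compare the result of a randomized transformation on a dataset that includes one specific person with the result on a dataset that does not include that person: e^ε bounds how much the probability of any outcome can differ, as a ratio, between the two. So if ε is 0, both transformations return the same results with the same probabilities, and their outputs are indistinguishable. The smaller the value of ε, the more likely the two transformations are to return the same result. A lower ε is therefore more private, because ε measures the strength of the privacy guarantee. If you query a dataset more than once, you need to sum the ε of each query to get the total privacy budget spent.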
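As a worked illustration of both ideas, the sketch below computes ε for a coin-flip randomized response and tracks a privacy budget by summing the ε of each query. It relies on assumptions that are not from the text: the randomized response answers truthfully with probability `p_truth` and otherwise picks yes or no uniformly, the budget uses simple additive (sequential) composition, and `randomized_response_epsilon` and `PrivacyBudget` are hypothetical names.

```python
import math


def randomized_response_epsilon(p_truth: float = 0.5) -> float:
    """Epsilon of a coin-flip randomized response.

    Assumed protocol: answer truthfully with probability p_truth,
    otherwise answer yes/no uniformly at random.
    """
    if not 0 <= p_truth < 1:
        raise ValueError("p_truth must be in [0, 1)")
    p_yes_if_yes = p_truth + (1 - p_truth) / 2   # P(report yes | truth yes)
    p_yes_if_no = (1 - p_truth) / 2              # P(report yes | truth no)
    # e^epsilon bounds the worst-case ratio of the two output probabilities.
    return math.log(p_yes_if_yes / p_yes_if_no)


class PrivacyBudget:
    """Tracks cumulative epsilon by summing the epsilon of each query."""

    def __init__(self, total_epsilon: float) -> None:
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon: float) -> None:
        """Record one query, refusing it if the budget would be exceeded."""
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise ValueError("privacy budget exceeded")
        self.spent += epsilon

    @property
    def remaining(self) -> float:
        return self.total_epsilon - self.spent


if __name__ == "__main__":
    # Fair coin: probabilities 0.75 vs 0.25, so epsilon = ln(3).
    print(f"epsilon = {randomized_response_epsilon(0.5):.4f}")

    budget = PrivacyBudget(total_epsilon=1.0)
    budget.spend(0.4)
    budget.spend(0.4)
    print(f"spent = {budget.spent:.1f}, remaining = {budget.remaining:.1f}")
```

For the fair-coin case, a "yes" is reported with probability 0.75 when the truth is "yes" and 0.25 when it is "no", so the ratio is 3 and ε = ln 3 ≈ 1.10. Answering purely at random (`p_truth = 0`) gives ε = 0, matching the statement that ε = 0 makes the two outputs indistinguishable.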

ISBN: 9787115573216