Chapter 4. Fake Data Gives Real Answers
What do you do when this happens?
Customer: We have sensitive data and a critical query that doesn’t run at scale. We need expert assistance. Can you help us?
Experts: We’d be happy to. Can we log into your machine?
Customer: No.
Experts: Can you show us your data?
Customer: No.
Experts: How about a stack trace?
Customer: No.
Customer: Can you help us?
At first glance, it may seem as though the customer in this scenario is being unreasonable, but they’re actually being smart and responsible. Their data and the details of their project may indeed be too sensitive to be shared with outside experts like you, and yet they do need help. What’s the solution?
A similar problem arises when someone is trying to develop a machine learning system securely. Machine learning requires an iterative approach: train a model, evaluate its performance, tune the model, and repeat. It’s often not a straightforward cookbook process, but instead one that requires the data scientist to understand the data well. The data scientist must be able to interpret the initial results produced by the trained model and use this insight to tweak the knobs of the right algorithm to improve performance and better model reality. In these situations, the project can often benefit from the experience of outside collaborators, but getting this help can be challenging when there is a security perimeter protecting the ...