As a software security evaluator and a one-time engineer, I can confirm what the daily security breaches are telling us: software engineers and architects regularly fail at building in sufficient security and privacy. As someone who has been on both sides of this table, I’d like to share some of my own security-related engineering sins and provide some practical advice for both engineers and security officers on how best to balance development goals with privacy concerns.
I started programming many years ago, working in a role where I created artificial intelligence software for data analysis. My team and I built innovative software solutions for predicting behavior, such as programs that could aid in preventing crimes. As enthusiastic engineers, we were so focused on building something cool and of great value that we tended to overlook the security risks in our programs, and we were profoundly annoyed when privacy officers said “no” to what we wanted to do. This behavior typically resulted in unofficial implementations with unfortunate privacy anti-patterns (nicknamed in capitals): COLLECTTOOMUCH, KEEPTOOLONG, BADSECURITY, and SCATTER, the last of which refers to storing copies of data elsewhere without keeping them up to date. I also saw LEAKEXTERNAL, which came up when home addresses were sent to an external online service for drawing maps. This mentality of prizing visible features over invisible risks, while somewhat understandable, is among the chief causes of the privacy problems that plague engineering.
Now, as a software security evaluator, I see that sometimes even the simplest data protection is missing from programs, which highlights that the problem with building in security and privacy is not complexity, per se—it’s our habit as engineers to work hard on what is emphasized and visible. We are driven by the immediate business value of features and data, so we build features ASAP and collect as much data as we can. We tend to put our heads in the sand when it comes to the misery of our users whose data may leak from our systems, because after collecting it, we often forget about protecting it.
When I was an engineer, I personally experienced how feature enthusiasm can be blinding. But now I am on the other side of the table, as a privacy and security advisor to software makers. Based on my own experiences, I will not immediately turn down engineers’ requests, but instead I will ask them to consider what they want to accomplish. In many situations, we can collect beneficial data while also safeguarding users’ security and privacy. For example, in the case of the external online service that received users’ addresses, the complete home addresses didn’t need to be sent externally to draw a map—just the zip codes provided sufficient detail for the service’s purpose and sufficient privacy for users. Another, more universal example is the collection of web browsing behavior: by storing such information with a pseudonym rather than with users’ personal details, companies can analyze the information to better understand user behavior without the risk of leaking privacy-sensitive data.
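To make the two examples above concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the implementation used in any of the projects described: the function names, the record fields, and the choice of a keyed hash (HMAC-SHA256) to derive pseudonyms are all hypothetical. The first function keeps only the zip code before anything is sent to an external map service; the second replaces a user identifier with a stable pseudonym before a browsing event is stored.

```python
import hashlib
import hmac

# Hypothetical secret key for illustration only. In practice it would be
# stored separately from the analytics data and never hard-coded.
PSEUDONYM_KEY = b"replace-with-a-secret-key"


def minimize_for_map(record: dict) -> dict:
    """Keep only the zip code; name, street, and other fields are dropped."""
    return {"zip_code": record["zip_code"]}


def pseudonymize(user_id: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Derive a stable pseudonym from user_id with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def browsing_event(user_id: str, url: str) -> dict:
    """Build an analytics event that carries a pseudonym instead of the user ID."""
    return {"user": pseudonymize(user_id), "url": url}


if __name__ == "__main__":
    # Made-up example data.
    customer = {"name": "A. Example", "street": "1 Example Street", "zip_code": "12345"}
    print(minimize_for_map(customer))
    print(browsing_event("a.example@example.com", "/pricing"))
```

The same user always maps to the same pseudonym, so behavior can still be analyzed per visitor, while the stored events no longer contain the identifier itself. Whether this is sufficient protection depends on what else is stored alongside the pseudonym and on how the key is managed.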
A balance can exist between development goals and privacy and security concerns. My advice to data-driven engineers is to be careful, think about how much data you really need, and don’t get greedy. Whenever you handle data, think about the what-ifs: Can I safely store this information here? Can we securely send this data over there? Shouldn’t we inform users about this type of use, or ask for their consent? Remember, asking these questions now will save you and your users many headaches later. To security and privacy decision-makers, I’d like to suggest that you always see yourself as a partner of the engineers. Help them consider the issues and seek solutions, and avoid becoming a roadblock to your own business outcomes.
Software teams and security teams must work together to solve problems with software security and privacy. Software makers need to be able to see the issues they introduce and, instead of just being stopped, get guidance from security experts in finding the right balance between data risk and data value.
This post is a collaboration between O'Reilly and SIG. See our statement of editorial independence.