Recognizing and evaluating scientific claims in security

Five questions for Josiah Dykstra on techniques to expose and invalidate misleading claims.

By Courtney Allen and Josiah Dykstra

September 20, 2017

Fish (source: Jennifer42)

I recently sat down with Josiah Dykstra, Senior Security Researcher at the Department of Defense, to discuss accidentally and intentionally misleading communications in security, common pitfalls in evaluating scientific claims, and the questions you should ask when evaluating those claims and third-party vendor solutions.

What are some basic tips for recognizing and understanding scientific claims in security marketing, journalism, or other security-related materials?

People and companies use a spectrum of truly scientific, possibly scientific, and unscientific statements to talk about products and services. Some are trying to persuade you to buy something; others are simply trying to communicate information. Though scientists themselves can produce misleading and manipulative results, I am generally more concerned about the potential damage caused by scientific-sounding claims from other sources.

In this context, “scientific claims” are those that could be, or appear to be, based on the scientific method. The appearance of science comes in part from deliberate word choice and linguistic ambiguity, particularly in short statements used to grab your attention. Here are a few examples of claims that imply supporting data or science-based discovery:

“Our product significantly outperforms our peers”
“Product X stops known and unknown malware with greater than 99.5% efficacy”
“Seven out of ten websites we scanned on the Internet are vulnerable to XYZ attack”

Security professionals should employ healthy scientific skepticism and seek further clarification when confronted with comparisons, performance figures, or evaluations, and whenever results seem too good to be true.

What are some common pitfalls that defenders should be aware of as they consider the veracity of claims made by vendors, media, or others?

We often forget that we’re all fallible human beings. Human factors, including cognitive bias, play a big part in how we process and evaluate claims. Try to be aware of situations when a sales pitch is appealing to emotion and not your rational evaluation. Learn about different types of cognitive bias, especially confirmation bias and fundamental attribution error.

Surveys, which are common in infosec, are fertile ground for abuse. Red flags include surveys that fail to disclose their methodology, sample size, or margin of error. Even when disclosed, these details may be hidden in small print and difficult to locate.
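As a rough illustration of why sample size and margin of error matter, the sketch below estimates the margin of error for a survey percentage using the standard normal approximation at 95% confidence. The function name `margin_of_error`, the 70% figure, and the sample sizes are hypothetical values chosen for illustration. The calculation also assumes a simple random sample, which many surveys do not have.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation z * sqrt(p(1-p)/n); z = 1.96 corresponds
    to roughly 95% confidence. Assumes a simple random sample.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical survey result: 70% of respondents reported some outcome.
for n in (50, 500, 5000):
    moe = margin_of_error(0.70, n)
    print(f"n={n}: 70% +/- {moe * 100:.1f} percentage points")
```

The same headline percentage is far less precise when it comes from 50 respondents than from 5,000, which is why an undisclosed sample size makes a survey result hard to interpret.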

What should defenders keep in mind when reviewing graphics and visualizations specifically?

Graphics and visualizations can certainly help us to understand information more quickly and efficiently. Research shows, however, that people are sometimes persuaded by even trivial graphs because those visualizations “seem” scientific. Defenders should be aware of this cognitive tendency, and take time to understand and assess how a graphic is being used and what it actually shows.

Be aware that the human brain interprets some types of graphs more easily than others. For example, pie charts can be particularly challenging for the brain to interpret (consider bar charts instead). Even the choice of colors in a visualization can carry cultural symbolism. For example, red conveys good luck in China but caution in the United States.

What questions should defenders ask as they assess claims or results provided by vendors or peers?

Here are three questions I frequently use when evaluating a new claim:

Who conducted and/or paid for the work? Compared with evaluations or results that came from the vendor, independent analysis and results are less prone to seeking evidence that simply confirms a desired result.
What was the size and composition of the data sample? I have low trust and confidence in results and claims based on unrepresentatively small sample sizes (e.g., anti-malware tests using 20 samples when there are millions of malware files in existence) or on selectively chosen data points. Either can overstate or understate the true result.
What do the adverbs and adjectives mean? Subjective, imprecise words such as “significantly” or “substantially” might carry a different meaning or judgment for the author than they do for me.
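To make the sample-size question concrete, here is a minimal sketch that computes a 95% Wilson score interval for a detection rate measured on a small test set. The Wilson interval is one common choice for proportions near 0 or 1; the function name `wilson_interval` and the scenario (a product detecting 20 of 20 samples) are assumptions for illustration, not data from any real test.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4 * n * n)
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical test: the product detected all 20 of 20 malware samples.
low, high = wilson_interval(20, 20)
print(f"Observed 100%, plausible range: {low * 100:.1f}% to {high * 100:.1f}%")
```

Under these assumptions, even a perfect score on 20 samples is consistent with a true detection rate in the mid-80s percent, so it cannot by itself support a claim such as "greater than 99.5% efficacy."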

You’re teaching a tutorial on She Blinded Me With Science: Understanding Misleading, Manipulative, and Deceptive Cybersecurity at the O’Reilly Security Conference in New York this October. What other presentations are you looking forward to attending while there?

I’m really looking forward to two talks with different perspectives on the human aspects of security: Jessy Irwin’s “It’s Us, Not Them: Exploring the Weakest Links in Security” and Chester Wisniewski’s “Embracing security as a culture.” As a computer scientist, I’ve learned to appreciate and seek knowledge about the often-overlooked complexity and impact of human strengths and weaknesses in developers, designers, administrators, adversaries, and yes, users.
