Alistair Croll

Strata Online Conference: Data Warfare

Data is our foundation. What happens when it's under attack?

Date: This event took place live on January 22, 2013

Presented by: Alistair Croll

Duration: Approximately 120 minutes.

Cost: Free

From public policy to elections, from healthcare to the battlefield, our lives rely on the analysis of abundant, connected data. But if data is infrastructure, then that infrastructure is vulnerable. Enemies can confound, confuse, distort, and mislead by attacking the information we collect and the ways we analyze it.

In this online event, we'll look at data under attack. We'll see how data fights crime, and how it might abuse innocence. We'll look at sock puppets, identity fraud, and Internet fakery; and we'll look at some of the ways we can fend off attack. Join us for a free Strata online event as we look at the downside of data dependency, and the coming Data War.


Stacks get hacked
Alistair Croll

The history of communications technology has been open, interoperable stacks, built by shameless idealists, that get attacked by bad guys. From denial-of-service attacks to SYN floods to spam to phishing, most of today's cyber-security exploits capitalize on the inherent openness of the Web.

But Big Data is at the top of the stack. It's where technology meets people. If we're living in a data-driven world, then it's one that's easily hijacked. Bad data can undermine machine learning, misinform, and obfuscate. And while that doesn't sound dangerous, it's enough to sway an election or cripple a market.

In this opening session, Strata chair Alistair Croll will look at what happens when stacks get attacked, why it's inevitable, and the coming data arms race.

About Alistair Croll

Alistair Croll has been an entrepreneur, author, and public speaker for nearly 20 years. In that time, he has worked on a variety of topics, from web performance to big data, cloud computing, and startups.

In 2001, he co-founded web performance startup Coradiant, and since that time has also launched Rednod, CloudOps, Bitcurrent, Year One Labs, the Bitnorth conference, the International Startup Festival and several other early-stage companies.

Alistair is the author of three books on web performance, analytics, and IT operations, and is currently working on a forthcoming book about data-driven startups. He lives in Montreal, Canada and tries to mitigate chronic ADD by writing about far too many things at Solve For Interesting.


Sex. Drugs. Rock. And CODE: Hacking Cybersecurity
Christina Gagnier

Cybersecurity and "hacking" have recently become topics in the mainstream media. While some hacking amounts to seemingly innocuous pranks, serious electronic breaches have threatened the security of major organizations. Developing appropriate responses to cybersecurity threats is key. Yet, in the US, legislative initiatives attempting to address the problem have so far been awkward and controversial.

About Christina Gagnier

Christina Gagnier leads the Intellectual Property, Internet & Technology practice at Gagnier Margossian LLP, with a specialization in social media, copyright and information privacy. A member of the State Bar of California, Gagnier has been active in the field of intellectual property since 2002. Gagnier serves as the Chief Executive Officer of TRAIL, managing platforms like JobScout and HealthScout.

Gagnier is currently working on a book about hacker rights and on her other passion, California politics. If you ever need to find her, start with Twitter.


Crowdsourcing large scale identity theft and fraud to make bucket loads of easy money
Jo Prichard

This session will demonstrate how easy it is to crowdsource identity theft to commit fraud and make money. We will look at which opportunities and segments of the population are easy targets for large-scale identity fraud, what insights are gained from this analysis, and what can be done on the ground to narrow the window of opportunity for these types of operations and schemes. We will discuss at-risk identities and synthetic identities, and look specifically at Tax Refund Fraud and Disaster Relief Fraud. Attendees will learn how LexisNexis uses big data and massive compute capability to tackle identity theft at scale. Two insights will be shared about protecting your own identity, along with a simple strategy for protecting yourself from tax refund fraud.

About Jo Prichard

Mr. Prichard is an Architect at LexisNexis Risk Solutions, where he has responsibilities for the core HPCC Systems platform technology. He spearheads large-scale graph analytics projects on big data for various industries, helping customers target fraud, collusion, and other red-flag indicators. Prior to LexisNexis, Mr. Prichard worked for Topspeed Software R&D in London and was a conference speaker on various aspects of the Clarion programming language. He also worked for the largest insurance company in South Africa, creating systems to help customers receive immediate insurance quotes.


Using data for EVIL: a beginner's guide
Duncan Ross and Fran Bennett

Being good is hard. Being evil is much more fun and gets you paid a lot more. Duncan and Fran will give an introduction to some of the simplest things you can do to make the maximum (negative) impact on your friends, your business and the world. This session will give you a quick and easy guide to becoming an evil overlord of data without really trying. With (unfortunately) anonymized examples from the real world we will show how ordinary data scientists can have a real impact on the world around them, with very little effort.

You could use this to consider how to avoid ethical dilemmas, to develop ways to deal responsibly with data, or even to do good. But that would be perverse.

About Duncan Ross

Duncan Ross has been a data miner since the mid 1990s. He was Director of Advanced Analytics at Teradata until 2010, leaving to become Data Director of Experian UK. He recently rejoined Teradata to lead their International Data Science team.

In his spare time Duncan has been a city Councillor and chair of a UK charity, founded an award-winning farmers' market, and is one of the founding Directors of the Society of Data Miners. Evil rating: *****

About Fran Bennett

Francine Bennett is a data scientist, and is the CEO and cofounder of Mastodon C. Mastodon C are agile big data specialists, who offer the open source technology platform and the technical and analytical skills which help companies realise the potential of their data.

Before founding Mastodon C, she spent a number of years working on big data analysis for search engines, helping them to turn lots of data into even more money. She enjoys good coffee, running, sleeping as much as possible, and exploring large datasets. Evil rating: ****


Black-hat Data Science
Joseph Turian

If you're an evil genius with a yen for data science, what are your possible attack vectors? If you're a good guy and want to protect your data from unscrupulous competitors, what are your counter-attacks? How effective are they? This talk will focus on data science attack vectors that can be exploited for commercial, not military, gain.

About Joseph Turian

Joseph Turian, Ph.D., heads MetaOptimize LLC, which consults on predictive analytics, business intelligence, NLP, ML, and data strategy. He also runs the MetaOptimize Q&A site, where Machine Learning and Natural Language Processing experts share their knowledge. He specializes in large data sets.

Joseph Turian earned his Ph.D. in computer science (with a focus on Machine Learning and Natural Language Processing) from New York University in 2007. During his graduate studies, he developed a fast, large-scale machine learning method for parsing natural language. He received his AB from Harvard University in 2001.


What to do when your Machine Learning gets attacked
Vishwanath Ramarao

Classic data science problems involve finding stationary patterns in big datasets. However, in adversarial settings, enemies deliberately shift their approach to avoid detection. They can challenge learning systems by randomizing behavior, hiding tracks, lacing traffic and more. Successful application of machine learning requires new approaches to feature engineering, training and classification.

In this talk, I cover engineering best practices for ML system design when an adversary is present: how to do better feature engineering, learning system design, and feedback management to make learning systems less susceptible to gaming and manipulation.

About Vishwanath Ramarao

Vishwanath Ramarao is the co-founder and CTO of Impermium, a Redwood City startup focused on keeping the web safe. Prior to Impermium, Vish led data insights, antispam, and search teams across Yahoo. Vish has several years of experience working on machine learning, optimization, numerical computing, and drug discovery problems.
