Real-World Active Learning

Introduction

The online world has blossomed with machine-driven riches. We don’t send letters; we email. We don’t look up a restaurant in a guide book; we look it up on OpenTable. When a computer that makes any of this possible goes wrong, we even search for a solution online. We thrive on the multitude of “signals” available.

But where there’s signal, there’s “noise”—inaccurate, inappropriate, or simply unhelpful information that gets in the way. For example, in receiving email, we also fend off spam; while scouting for new employment, we receive automated job referrals with wildly inappropriate matches; and filters made to catch porn may confuse it with medical photos.

We can filter out all of this noise, but at some point it becomes more trouble than it’s worth—that is when machines and their algorithms can make things much easier. To filter spam mail, for example, we can give our machine and algorithm a set of known-good and known-bad emails as examples so the algorithm can make educated guesses while filtering mail.

Even with solid examples, though, algorithms fail and block important emails, filter out useful content, and cause a variety of other problems. As we’ll explore throughout this report, the point at which algorithms fail is precisely where there’s an opportunity to insert human judgment to actively improve the algorithm’s performance.

In a recent article on Wired (“The Huge, Unseen Operation Behind the Accuracy of Google Maps,” 12/08/14), ...

Get Real-World Active Learning now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.