Database Nation by Simson Garfinkel


Chapter 4. What Did You Do Today?

When I was a teenager, I tried keeping a diary. I took out my pen every night before I went to sleep and wrote down the details of the previous day. I had just started dating and soon the book's pages were filled with stories of my teenage romances: I'd write down who I liked and who I didn't; who I had seen at school and who I had talked to on the phone. And, of course, I wrote down the details of my dates themselves: who they were with, where we had gone, what we had eaten, and what we had done.

After a month or so I had created quite an impressive historical record of my teenage exploits. But as time passed, my entries started getting shorter and shorter. It was just too much work to write down all of the details. Ultimately, my project collapsed under the weight of its own data.

Keeping that diary in today's world would be much easier. Every time I buy something with a credit card, I get back a little yellow slip telling me the exact time and location of my purchase. I get a much more detailed receipt at my neighborhood supermarket that lists the name and size of everything in my shopping cart. My airline's frequent flyer statement lists every city that I've flown to over the past year. Should I accidentally throw out the statement, all of this information is stored safely in numerous computer databanks.

Even my telephone calls are carefully recorded, tabulated, and presented to me at the end of each month. I remember in college when my girlfriend broke up with me during a long-distance phone call. We talked for 20 minutes, then she hung up. I called her back again and again; I got her answering machine each time. A few weeks later, the phone bill came in the mail, and there were the calls: one for 20 minutes, and then five calls in rapid succession, each one lasting just 15 seconds.

But by far the most detailed records of my life reside on my computer's hard drive: my stored email messages, going back to my freshman year in college. All told, there are more than 600 megabytes of information—roughly 315,000 pages of double-spaced text, or 40 pages of text for every day since September 3, 1983, when I got my first email account at MIT.

"Keep all your old email messages," my friend Harold told me just before I graduated. "When historians look back at the 1980s, we are the ones they're going to be writing about." And he was right: with keyword searching and advanced text-processing algorithms, it will be a simple matter for some future historian to assemble a very accurate record of my life as a college student—and my life ever since—by examining the written electronic record I've left behind.

But this archive of facts and feelings is a rapier that can slice two different ways. More than my own digital diary, I have also been casting a vast "data shadow" that reveals the secrets of my daily life to anyone who can read it.

Alan Westin coined the term data shadow in the 1960s. Westin, a professor at Columbia University in New York, warned that credit records, bank records, insurance records, and other information that made up America's emerging digital infrastructure could be combined to create a detailed digital dossier. The metaphor, with its slightly sinister feeling, was uncannily accurate: just as few people are aware of where their shadows fall, few data subjects in the future, Westin conjectured, would be able to keep track of their digital dossiers.

In the three decades that have passed since then, the data shadow has grown from an academic conjecture to a concrete reality that affects us all.

We stand at the brink of an information crisis. Never before has so much information about so many people been collected in so many different places. Never before has so much information been made so easily available to so many institutions in so many different ways and for so many different purposes.

Unlike the email that's stored on my laptop, my data shadow is largely beyond my control. Scattered across the computers of a hundred different companies, my shadow stands at attention, shoulder-to-shoulder with an army of other data shadows inside the databanks of corporations and governments all over the world. These shadows are making routine the discovery of human secrets. They are forcing us to live up to a new standard of accountability. And because the information that makes up these shadows is occasionally incorrect, they leave us all vulnerable to punishment or retaliation for actions that we did not even commit.

The good news is that we can fight back against this wholesale invasion of personal privacy. We can fight to stop the capturing of everyday events. And where capture is inevitable, we can establish strong business practices and laws that guarantee the sanctity of our privacy—protection for our shadows to live by. We have done so before. All that's needed is for people to understand how this information is being recorded, and how to make that recording stop.

The Information Crisis

As an experiment, make a list of the data trails that you leave behind on a daily basis. Did you buy lunch with a credit card? Write that down. Did you buy lunch with cash, but visit the automatic teller machine (ATM) beforehand? If so, then that withdrawal makes up your data shadow as well. Every long distance phone call, any time you leave a message inside a voice mailbox, and every web page you access on the Internet—all of these are part of your comprehensive data profile.

You are more likely to leave records if you live in a city, if you pay for things with credit cards, and if your work requires that you use a telephone or a computer. You will leave fewer records if you live in the country or if you are not affluent. This is really no surprise: detailed records are what make the modern economy possible.

What is surprising, though, is the amount of collateral information that these records reveal. Withdraw cash from an ATM, and a computer records not just how much money you took out, but the fact that you were physically located at a particular place and time. Make a telephone call to somebody who has Caller ID, and a little box records not just your phone number (and possibly your name), but also the exact time that you placed your call. Browse the Internet, and the web server on the other side of your computer's screen doesn't just record every page that you download—it also records the speed of your computer's modem, the kind of web browser you are using, and even your geographical location.

There's nothing terribly new here, either. In 1986, John Diebold wrote about a bank that seven years earlier

had recently installed an automatic teller machine network and noticed "that an unusual number of withdrawals were being made every night between midnight and 2:00 a.m."...Suspecting foul play, the bank hired detectives to look into the matter. It turns out that many of the late-night customers were withdrawing cash on their way to a local red light district![1]

An article about the incident that appeared in the Knight News Service observed: "there's a bank someplace in America that knows which of its customers paid a hooker last night."[2] (Diebold, one of America's computer pioneers in the 1960s and 1970s, had been an advocate of the proposed National Data Center. But by 1986, he had come to believe that building the Data Center would have been a tremendous mistake, because it would have concentrated too much information in one place.)

I call records such as banks' ATM archives hot files. They are juicy, they reveal unexpected information, and they exist largely outside the scope of most people's understanding.

Over the past 15 years, we've seen a growing use of hot files. One of the earliest cases that I remember occurred in the 1980s, when investigators for the U.S. Drug Enforcement Administration started scanning through the records of lawn-and-garden stores and correlating the information with data dumps from electric companies. The DEA project was called Operation Green Merchant; by 1993, the DEA, together with state and local authorities, had seized nearly 4,000 growing operations, arrested more than 1,500 violators, and frozen millions of dollars in illicitly acquired profits and assets.[3] Critics charged that the program was a dragnet that caught both the innocent and the guilty. The investigators were searching out people who were clandestinely raising marijuana in their basements. While the agents did find some pot farmers, they also raided quite a few innocent gardeners, including one who lived next to an editor at the New York Times. The Times eventually wrote an editorial, but it didn't stop the DEA's practices.

Americans got another dose of hot file surprise in the fall of 1987, when President Ronald Reagan nominated Judge Robert Bork to the Supreme Court of the United States. Bork's nomination was fiercely opposed by women's groups, who said that the judge had a history of ruling against women's issues; they feared that Bork would be the deciding vote to help the Court overturn a woman's right to an abortion. Looking for dirt, a journalist from Washington, D.C.'s liberal City Paper visited a video rental store in Bork's neighborhood and obtained a printout from the store's computer of every movie that Bork had ever rented there. The journalist had hoped that Bork would be renting pornographic films. As it turned out, Bork's tastes in video veered towards mild fare: the 146 videos listed on the printout were mostly Disney movies and Hitchcock films.

Nevertheless, Bork's reputation was still somewhat damaged. Some accounts of the Bork story that have been published and many off-handed remarks at cocktail parties often omit the fact that the journalist came up empty in the search for pornography. Instead, these accounts erroneously give the impression that Bork was a fan of porn, or at least allow the reader to draw that conclusion.

The problem with hot files, then, is that they are too hot: on the one hand, they reveal information about us that many people think a dignified society keeps private; on the other hand, they are easily misinterpreted. And it turns out that these records are also easily faked: if the clerk at the video rental store had wanted to do so, that person could easily have added a few dozen porno flicks to the record, and nobody could have proved that the record had been faked.

As computerized record-keeping systems become more prevalent in our society, we are likely to see more and more cases in which the raw data collected by these systems for one purpose is used for another. Indeed, advancing technology makes such releases all the more likely. In the past, computer systems simply could not store all of the information that they could collect: it was necessary to design systems so that they would periodically discard data when it was no longer needed. But today, with the dramatic developments in data storage technology, it's easy to store information for months or years after it is no longer needed. As a result, computers are now retaining an increasingly complete record of our lives, as they did with Judge Bork's video rental records. Ask yourself this: what business did the video rental store have keeping a list of the movies that Bork had rented, after the movies had been returned?

This sea of records is creating a new standard of accountability for our society. Instead of relying on trust or giving people the benefit of the doubt, we can now simply check the record and see who was right and who was wrong. The ready availability of personal information also makes things easier for crooks, stalkers, blackmail artists, con men, and others who are up to no good. One of the most dramatic cases was the murder of actress Rebecca Schaeffer in 1989. Schaeffer had gone to great lengths to protect her privacy. But a 19-year-old crazed fan, who allegedly wanted to meet her, hired a private investigator to find out her home address. The investigator went to California's Department of Motor Vehicles, which at the time made vehicle registration information available to anyone who wanted it, since the information was part of the public record. The fan then went to Schaeffer's house, waited for four hours, and shot her once in the chest when she opened her front door.[4]



[1] James Finn and Leonard R. Sussman, eds., Today's American: How Free? (New York: Freedom House, 1986), p. 111.

[2] Ibid.

[3] U.S. Department of Justice Drug Enforcement Administration, "U.S. Drug Threat Assessment: 1993. Drug Intelligence Report. Availability, Price, Purity, Use, and Trafficking of Drugs in the United States," September 1993, DEA-93042. Available online at http://mir.drugtext.org/druglibrary/schaffer/GOVPUBS/usdta.htm.

[4] "TV-Movie Actress Slain in Apartment," Associated Press, July 19, 1989. "Arizona Holds Man in Killing of Actress," Associated Press, July 20, 1989. "Suspect in Slaying Paid to Find Actress," Associated Press, July 23, 1989.
