Chapter 1. Introduction

What Is Internet Forensics?

Forensics is the application of scientific methods in criminal investigations. It is a unique field of study that draws from all areas of science, from entomology to genetics, from geology to mathematics, with the single goal of solving a mystery. It holds a great fascination for the general public. Thanks to television dramas, millions of us are familiar with how rifling marks on a bullet can identify a murder weapon and how luminol is used to reveal bloodstains in the bath.

Computer forensics studies how computers are involved in the commission of crimes. In cases ranging from accounting fraud to blackmail, identity theft, and child pornography, the contents of a hard drive can contain critical evidence of a crime. The analysis of disks and the tracking of emails between individuals have become commonplace tools for law enforcement around the world.

Internet forensics shifts that focus from an individual machine to the Internet at large. With a single massive network that spans the globe, the challenge of identifying criminal activity and the people behind it becomes immense. A con artist in the United States can use a web server in Korea to steal the credit card number of a victim in Germany.

Unfortunately, the underlying protocols that handle Internet traffic were not designed to address the problems of spam, viruses, and so forth. It can be difficult, often impossible, to verify the source of a message or the operator of a web site. In cases like this the minor details become important. The layout of files on a web site or the way that email headers are forged can play the same role as a fingerprint at a physical crime scene.
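As a rough illustration of the kind of minor detail involved, the sketch below lists the Received headers of a message from oldest to newest, using Python's standard email module. The message, hostnames, and addresses are made-up placeholders, and the function name received_chain is hypothetical; it is only a sketch of where one might start looking, not a method for detecting forgery.

```python
from email import message_from_string

# Hypothetical sample message; every hostname and address is a placeholder.
RAW_MESSAGE = """\
Received: from mail.example.net (mail.example.net [192.0.2.10])
    by mx.example.org with ESMTP; Mon, 3 Jan 2005 10:00:02 -0500
Received: from unknown (HELO bank.example.com) (198.51.100.7)
    by mail.example.net with SMTP; Mon, 3 Jan 2005 10:00:00 -0500
From: "Example Bank" <support@bank.example.com>
Subject: Please verify your account

Body text.
"""


def received_chain(raw):
    """Return the Received headers, oldest hop first, with whitespace collapsed."""
    msg = message_from_string(raw)
    hops = msg.get_all("Received", [])
    # Each relay adds its header at the top, so reversing gives oldest first.
    return [" ".join(hop.split()) for hop in reversed(hops)]


chain = received_chain(RAW_MESSAGE)
for number, hop in enumerate(chain, start=1):
    print(number, hop)
```

Headers added by servers you trust are generally more reliable than those nearer the claimed origin, which the sender may have written themselves.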

This book shows you some of the ways in which the bad guys try to conceal their identities. I show you how simple techniques, a knowledge of how the Internet works, and an inquisitive mind can reveal a lot more about these people than they would like.

The Seamy Underbelly of the Internet

History shows us that any situation that involves people and money will quickly attract crime. That has certainly been the case with the Internet. Online crime is at an all-time high and shows no signs of slowing down, despite the best efforts of the computer security industry.

The Scams

Many forms of criminal activity use the Internet as a means of communication, either using email instead of phone calls or publishing offensive material on a web site instead of hard copy. But the Internet has allowed some types of crime to evolve in new ways so as to exploit the new opportunities that it provides.

Spam is the most widespread of these activities. Unsolicited email places a burden on millions of servers every day. Companies spend huge amounts of money on software and staff to help keep the problem under control. They do so to save their employees from having to deal with all of it on their desktops, which would incur even higher costs in the form of lower productivity.

People who are computer savvy tend to focus on the nuisance factor of spam because that is what directly affects us. We tend to overlook the content of those messages because we already know them to be scams. We would never dream of clicking on URLs for web sites that promise us cheap Viagra, great rates on mortgages, or the chance to meet lonely singles in our neighborhoods. But other people do! If they didn’t, then the people running the web sites would not waste their money hiring the spammers to distribute their emails.

Most of these are traditional scams that have been updated to entice Internet-savvy victims. Their goal is to get you to hand over your credit card number. Being able to reach millions of potential victims through the power of spam is what makes it so attractive.

Phishing is the name we give to frauds involving fake web sites that look like those of banks or credit card companies. A phishing email is sent out like most other spam, but it attempts to entice victims by appearing to come from a well-known, legitimate business like Citibank or eBay. The message asks you to click on a URL that takes you to a web site. That web page, at first glance, looks just like the site of the genuine financial institution. The users are prompted to enter their online account information along with other personal details like their date of birth, credit card information, and so forth.

Computer viruses and worms were initially regarded as the malevolent creations of people who wanted to show off their programming skills and wanted to “get in the face” of computer users around the world. The immediate damage they caused ranged from negligible to minor. They were comparable to a graffiti tag spray painted on a wall. Their real impact lay in the effort it took to deal with infected computers and in preventing future attacks. But these threats have become more serious over time. Today’s viruses will actively disrupt the function of antivirus software and prevent such tools from being installed on an already infected system.

Perhaps the most significant development in this field is the convergence of viruses and spam, with certain recent viruses existing solely for the purpose of installing clandestine email servers on the desktop systems they infect. These servers are later employed as relays through which spam emails are sent, and which block the identification of the original sender.

The Numbers

The statistics on these threats are amazing. MessageLabs, a company that provides email security services, tracks their occurrence in the billions of messages that flow through their servers. Their Annual Email Security Report for 2004 paints a discouraging picture.

They report that spam made up 73% of all emails in 2004, with monthly fluctuations peaking at 94% in July of that year. That sounds like an incredibly high percentage, and I was skeptical when I first read it, but a quick, unscientific survey of my Inbox puts my percentage of junk mail into the same range.

Computer viruses were identified in 6% of all emails. Unlike previous years where a range of distinct viruses were rampant, 2004 saw the emergence of variations on a limited set of known viruses. Whether this reflects better anti-virus software or a shift in the approach taken by their creators is a hotly debated issue.

Phishing experienced the most dramatic growth in 2004. MessageLabs saw a monthly average of around 250,000 phishing emails in the first half of the year. But that ramped up rapidly in the second half to reach around 4,500,000 by year-end, an 18-fold increase in 6 months.
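The arithmetic behind that figure can be checked in a few lines. The sketch below uses the two monthly volumes quoted above; the assumption of steady compound growth across the six months is mine, made only to give a sense of scale, and is not something the report states.

```python
# Monthly phishing volumes quoted from the MessageLabs figures above.
start_volume = 250_000
end_volume = 4_500_000
months = 6

# Overall growth factor across the period.
fold_increase = end_volume / start_volume

# Equivalent month-over-month factor, ASSUMING steady compound growth.
monthly_factor = fold_increase ** (1 / months)

print(f"{fold_increase:.0f}-fold overall, about {monthly_factor:.2f}x per month")
```

Under that assumption, volumes would have grown by a little over 60% each month.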

Bear in mind that all this activity on the part of the bad guys is taking place in spite of the widespread use of excellent anti-virus software and spam filters. Collectively, we are working really hard on this problem, but we seem to be losing ground.

Why Is It Getting Worse?

Several factors lie behind this seemingly unstoppable growth:

  • Internet scams don’t cost much to set up.

  • The potential audience is huge.

  • The chance of getting caught is low.

  • The chance of getting prosecuted is minimal.

  • People are making money doing it.

The cost involved in setting up a phishing scam is almost negligible. You need a web server that you control, a little programming experience, and some way to send a lot of email messages. That is an investment of a few hundred dollars at most. All you need is one victim to give up their credit card number and you will have turned a profit.

Creating a large spam operation is a more expensive endeavor, as you need a pool of mail servers that can send out the messages. Using commercial servers, the costs are still low relative to the potential rewards, but that expense can be dispensed with entirely if you are able to commandeer the computers of unsuspecting users. That has been the rationale behind the recent computer viruses, which have installed email relay servers on their infected hosts.

The key to reaching the largest possible audience lies in automating the generation and distribution of email messages. Writing good scripts to do this is easy enough, but in the face of rapidly improving spam filters, more and more effort is being applied to the automated generation of messages that can evade these defenses. A form of intellectual arms race is starting to take shape between us and them. I hope that this book and the efforts of its readers will help tip the balance in our favor.

The risk of getting caught and convicted should serve as a strong deterrent to crime. Unfortunately the chances of either of these happening on the Internet are slim. The conviction rate for spamming remains so low that any individual case still attracts significant attention in the press. I discuss this more in Chapter 12.

Above all, the number one reason why Internet crime is growing so rapidly is that people are making money doing it. As long as that remains the case, criminals will find the resources they need to make it happen.

Pulling Back the Curtain

Who exactly is involved in Internet crime? The popular media seem to have settled on two very different profiles. The first is the Russian mob that has enlisted physicists, displaced from Cold War era government programs, to help them with their plans. The second is the American teenage boy nerd, seated in the dark isolation of his bedroom, working on the next great computer virus. Neither of these is really representative, although both contain substantial elements of truth. The fact is that the opportunities for this kind of fraud are so broad that someone can find a niche regardless of their technical background.

The advance fee scam, the so-called Nigerian 419 scam, requires nothing more than a good cover story, a list of email addresses, and the gall to carry it out. Creating a computer virus, or operating a professional spam distribution network, requires significant technical expertise. Some scams are so complex that multiple individuals must be involved. For an interesting perspective on a few individuals from the world of spam, I refer you to the book Spam Kings by Brian S. McWilliams (O’Reilly). In it, he describes how two well-known spammers got involved in the trade and how techniques like those described here were used to reveal them.

One thing common to everyone involved in Internet fraud is the desire to remain anonymous and thereby safe from prosecution. The bad guys go to great lengths to hang a curtain of disguise behind which they can operate. The forensic skills that you will learn from this book will help you pull back that curtain.

Just like traditional criminal forensics, you will use your skills to find the clues left behind at a crime scene. The only difference is that our crime scene takes the form of a web site, server, or email message. You are unlikely to uncover the name and address of the culprit, but you will be able to build up a picture of their operation, which can contain a surprising amount of detail.

Taking Back Our Internet

Over and above the immediate desire to identify the bad guys, I think a lot of us feel a deeper unease about their activities.

The developers and systems administrators among us talk about the Open Source Community, the informal collection of people responsible for creating and using Linux, Perl, and all the other tools that we use every day in our work. The word “community” is not just a convenient buzzword. Many of us feel a real sense of belonging to this global movement that has made the Internet what it is today.

No one can truly claim ownership of the Internet, but the Open Source Community can rightfully claim to be its stewards and guardians. As such, we feel betrayed by those who have crossed over to the Dark Side and are responsible for the nuisances and threats that all users now have to deal with.

Many developers have already stepped up to the challenge of taking back the Internet. Spam-filtering tools, firewalls, secure browsers, such as Firefox and Mozilla, along with a host of security patches, have been developed by open source developers for the good of the community. With the forensic techniques described in this book, I want to help advance another approach in this ongoing battle. By identifying the people responsible for these threats, we can put them under a great deal of pressure and force them to work much harder to achieve their goals.

I want this book to show you how easy it can be to uncover clues about Internet scams. You don’t need to be a computer security expert to apply these skills. In fact the key to their success lies in having hundreds and thousands of people like you pushing back and putting pressure on the bad guys. Collectively, we can be a very powerful force.

Protecting Your Privacy

Disclosure and privacy are two sides of the same coin. The same forensic techniques that you use to investigate a phishing web site can be used against you by someone else. The techniques do not discriminate. Privacy is a major concern for some people, less so for others. Regardless of where you fall on that scale, you should always be aware of what others can learn about you. Throughout the book, I will play for both teams. I will show you how to, for example, mine a web site for useful data and then show how, as the operator of a site, you can limit that disclosure.

You can make the argument that, by taking this approach, this book may actually help the scammers evade detection. In some cases, this may happen. However, this same issue has been raised many times in the field of conventional computer security. The counterargument, which I think has prevailed in that field, is that most of the bad guys already know how to improve their operations if they choose to. Either they are just lazy, or they don’t think the chance of being identified is high enough to warrant the effort.

By providing a full disclosure of the methods that scammers use to conceal themselves, and showing how you can still uncover identifying information, Internet forensics forces the bad guys further into a corner. There are many more of us than them, and our collective attention forces them to either work harder to practice their trade or, I hope, decide that it’s not worth the effort.

That is exactly what we have seen with other aspects of computer security. In the Linux community, new security problems are disclosed for all to see as soon as they are discovered. That prompts developers to fix the issues in a timely manner. In the early days, some of the vulnerabilities were serious and undoubtedly their disclosure led to some systems being attacked. But overall the approach has been a resounding success. Vulnerabilities are still being discovered, but their impact is typically much reduced and often they are fixed before any real-world exploit has been created. Full disclosure of the ways scammers work has made life increasingly difficult for system attackers and has undoubtedly led many to focus their attentions elsewhere.

The analogy of an arms race is appropriate. It may be an inefficient way to defeat an enemy, but it can be a very effective way to control their activities.

Before You Begin

I need to offer a few words of caution before you begin poking around some of the more dubious corners of the Internet.

Viruses, Worms, and Other Threats

Computer viruses and spyware are everyday threats on the Internet. But in actively seeking out and examining dubious web sites, you may be exposing your systems to higher than normal risks. As I describe in Chapter 3, the worlds of spam distribution and computer viruses have already merged in the form of the Sobig virus. This type of threat should not be a problem as long as you take suitable, simple precautions.

A Unix-based operating system, such as Linux or Mac OS X, is the preferred platform from which to investigate dubious web sites and email messages. The Unix environment is less susceptible to computer viruses, with control mechanisms that make it difficult for rogue executables to be installed simply by downloading them.

If you do use a Windows system to follow the techniques and examples given in this book, then you need to take several important precautions. It goes without saying that you need to have good antivirus software installed and running on the system. Not only that, it needs to be kept up to date with current virus definitions. If you are actively exploring web sites, then make sure you scan your system frequently.

The same goes for spyware, which is perhaps even more of a problem in the context of visiting web sites. There are some excellent free tools available for finding and eradicating this on Windows computers.

Again, you should scan your system frequently with these tools.

Historically, a major vulnerability on Windows systems has been Internet Explorer itself. A series of vulnerabilities have been exposed, exploited, and then patched over the past few years, giving this browser a poor security reputation. Hopefully those problems are a thing of the past, but if that is a concern, then you might want to use Mozilla Firefox as an alternate browser.


All of the techniques that I describe in the book make use of information that people disclose in the emails that they send and the web sites that they host. That information is readily accessible by anyone who knows where to look.

None of the techniques involve breaking into computers or probing them for vulnerabilities. That crosses the line from legitimate investigation into computer cracking, which in most instances is illegal. I do not, in any way, shape, or form, condone that activity.

But, as with most aspects of life, between these black-and-white extremes lies a gray area where things are not so clear-cut. For example, I have no problem mining a fake bank web site for every piece of information about its creators that I can find. But I would not dream of using those same skills to identify the people involved in, say, a support group for recovering addicts. To me, one target is legitimate and the other is not.

As you work your way through the book and apply the techniques to real emails and web sites, take a moment to consider the ethical implications of what you are doing. Use your powers wisely and stay away from the Dark Side!

Innocent Until Proven Guilty

Whenever they show a telephone number on television, they include 555 after the area code. This is a reserved block of numbers that don’t work, which the film companies use to prevent prank calls to regular phone lines. I have taken a similar approach by masking some of the Internet and email addresses that are used in this book.

Throughout the book, you will find many examples of email messages, domain names, URLs, and web pages. These are used to illustrate different techniques, and most are real examples from my Inbox or real sites that I have visited. Most were examples of spam, phishing, or some other dodgy operation, at that point in time. It is important to realize that most web sites that are involved in a scam are short-lived. The chance that any of these sites will still be operational by the time you read this book is minimal. In many of those cases, the specific Internet addresses will have been reassigned to other sites and most will be completely legitimate. Others may represent innocent sites that had been hijacked in order to host a phishing attempt.

You should not make assumptions about the current usage of any specific numeric addresses, hostnames, or web servers that are included in this book.

A Network Neighborhood Watch

Taking back the Internet from the con artists will require more than the efforts of computer security professionals. If it were that easy, then the problem would already have been solved. Educating consumers has undoubtedly helped, but people still fall victim to these scams every day.

I view myself as part of the global community of programmers and systems administrators, the power users of computing and the Internet. I suspect most readers of this book would feel the same affiliation. Given the technical skills that we possess, I feel that we have a collective responsibility to guide the development of the Internet and ensure that the values of freedom and openness are preserved as it continues to evolve.

We have the potential to make life very difficult for those behind Internet scams. With thousands of us working to reveal them, their sense of security will be threatened. I believe that this sense that nobody can touch them is a major reason for the growth of Internet crime. A community-based effort to uncover these scams has the potential to have a major impact. We need an effort similar to that of ordinary people who take part in a Neighborhood Watch to keep crime away from where they live simply by keeping an eye out for each other. We need a Network Neighborhood Watch.

This book will show you how to uncover information about web sites, servers, and email messages. It was written for anyone with modest computer skills, as opposed to the professional computer security expert. Anyone can apply these techniques. They use the basic tools and protocols of the Internet in creative ways to reveal clues that mostly go unnoticed. I think most readers will be surprised just how much can be revealed.

I encourage you to learn and experiment with the techniques, scripts, and hacks that are described here. If your Inbox is anything like mine, then you already have plenty of targets. I hope that you build upon these ideas and go on to share your own with the rest of this community. And I hope that you will do your part to make life harder for the bad guys and, in doing so, make the Internet a better place for all of us.
