Network Troubleshooting Tools

By Joseph D. Sloan


Table of Contents

Chapter 1: Network Management and Troubleshooting
The first step in diagnosing a network problem is to collect information. This includes collecting information from your users as to the nature of the problems they are having, and it includes collecting data from your network. Your success will depend, in large part, on your efficiency in collecting this information and on the quality of the information you collect. This book is about tools you can use and techniques and strategies to optimize their use. Rather than trying to cover all aspects of troubleshooting, this book focuses on this first crucial step, data collection.
There is an extraordinary variety of tools available for this purpose, and more become available daily. Very capable people are selflessly devoting enormous amounts of time and effort to developing these tools. We all owe a tremendous debt to these individuals. But with the variety of tools available, it is easy to be overwhelmed. Fortunately, while the number of tools is large, data collection need not be overwhelming. A small number of tools can be used to solve most problems. This book centers on a core set of freely available tools, with pointers to additional tools that might be needed in some circumstances.
This first chapter has two goals. Although general troubleshooting is not the focus of the book, it seems worthwhile to quickly review troubleshooting techniques. This review is followed by an examination of troubleshooting from a broader administrative context—using troubleshooting tools in an effective, productive, and responsible manner. This part of the chapter includes a discussion of documentation practices, personnel management and professionalism, legal and ethical concerns, and economic considerations. General troubleshooting is revisited in Chapter 12, once we have discussed available tools. If you are already familiar with these topics, you may want to skim or even skip this chapter.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
General Approaches to Troubleshooting
Troubleshooting is a complex process that is best learned through experience. This section looks briefly at how troubleshooting is done in order to see how these tools fit into the process. But while every problem is different, a key step is collecting information.
Clearly, the best way to approach troubleshooting is to avoid it. If you never have problems, you will have nothing to correct. Sound engineering practices, redundancy, documentation, and training can help. But regardless of how well engineered your system is, things break. You can avoid troubleshooting, but you can't escape it.
It may seem unnecessary to say, but go for the quick fixes first. As long as you don't fixate on them, they won't take long. Often the first thing to try is resetting the system. Many problems can be resolved in this way. Bit rot, cosmic rays, or the alignment of the planets may result in the system entering some strange state from which it can't exit. If the problem really is a fluke, resetting the system may resolve the problem, and you may never see it again. This may not seem very satisfying, but you can take your satisfaction in going home on time instead.
Keep in mind that there are several different levels in resetting a system. For software, you can simply restart the program, or you may be able to send a signal to the program so that it reloads its initialization file. From your users' perspective, this is the least disruptive approach. Alternatively, you might restart the operating system without cycling the power, i.e., do a warm reboot. Finally, you might try a cold reboot by cycling the power.
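As a minimal sketch of the least disruptive option, the following Python fragment sends a hangup signal to a running process. The function name request_reload is hypothetical. The sketch assumes a Unix system and a program that rereads its initialization file on SIGHUP, which is a common convention (inetd and syslogd follow it) but by no means universal.

```python
import os
import signal


def request_reload(pid: int) -> None:
    """Send SIGHUP to the process with the given PID (Unix only).

    Whether the process rereads its configuration on SIGHUP is a
    convention, not a guarantee; check the program's documentation first.
    """
    os.kill(pid, signal.SIGHUP)
```

You would normally take the PID from ps or from the program's PID file. Sending SIGHUP to a program that does not handle it will usually terminate that program, so this is worth checking before you try it on a production system.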
You should be aware, however, that there can be some dangers in resetting a system. For example, it is possible to inadvertently make changes to a system so that it can't reboot. If you realize you have done this in time, you can correct the problem. Once you have shut down the system, it may be too late. If you don't have a backup boot disk, you will have to rebuild the system. These are, fortunately, rare circumstances and usually happen only when you have been making major changes to a system.
Need for Troubleshooting Tools
The best time to prepare for problems is before you have them. It may sound trite, but if you don't understand the normal behavior of your network, you will not be able to identify anomalous behavior. For the proper management of your system, you must have a clear understanding of the current behavior and performance of your system. If you don't know the kinds of traffic, the bottlenecks, or the growth patterns for your network, then you will not be able to develop sensible plans. If you don't know the normal behavior, you will not be able to recognize a problem's symptoms when you see them. Unless you have made a conscious, aggressive effort to understand your system, you probably don't understand it. All networks contain surprises, even for the experienced administrator. You only have to look a little harder.
It might seem strange to some that a network administrator would need some of the tools described in this book, and that he wouldn't already know the details that some of these tools provide. But there are a number of reasons why an administrator may be quite ignorant of his network.
With the rapid growth of the Internet, turnkey systems seem to have grown in popularity. A fundamental assumption of these systems is that they are managed by an inexperienced administrator or an administrator who doesn't want to be bothered by the details of the system. Documentation is almost always minimal. For example, early versions of Sun Microsystems' Netra Internet servers, by default, did not install the Unix manpages and came with only a few small manuals. Print services were disabled by default.
This is not a condemnation of turnkey systems. They can be a real blessing to someone who needs to go online quickly, someone who never wants to be bothered by such details, or someone who can outsource the management of her system. But if at some later time she wants to know what her turnkey system is doing, it may be up to her to discover that for herself. This is particularly likely if she ever wants to go beyond the basic services provided by the system or if she starts having problems.
Troubleshooting and Management
Troubleshooting does not exist in isolation from network management. How you manage your network will determine in large part how you deal with problems. A proactive approach to management can greatly simplify problem resolution. The remainder of this chapter describes several important management issues. Coming to terms with these issues should, in the long run, make your life easier.
As a new administrator, your first step is to assess your existing resources and begin creating new resources. Software sources, including the tools discussed in this book, are described and listed in Appendix A. Other sources of information are described in Appendix B.
The most important source of information is the local documentation created by you or your predecessor. In a properly maintained network, there should be some kind of log about the network, preferably with sections for each device. In many networks, this will be in an abysmal state. Almost no one likes documenting or thinks he has the time required to do it. It will be full of errors, out of date, and incomplete. Local documentation should always be read with a healthy degree of skepticism. But even incomplete, erroneous documentation, if treated as such, may be of value. There are probably no intentional errors, just careless mistakes and errors of omission. Even flawed documentation can give you some sense of the history of the system. Problems frequently occur due to multiple conflicting changes to a system. Software that may have been only partially removed can have lingering effects. Homegrown documentation may be the quickest way to discover what may have been on the system.
While the creation and maintenance of documentation may once have been someone else's responsibility, it is now your responsibility. If you are not happy with the current state of your documentation, it is up to you to update it and adopt policies so the next administrator will not be muttering about you the way you are muttering about your predecessors.
Chapter 2: Host Configurations
The goal of this chapter is to review system administration from the perspective of the individual hosts on a network. This chapter presumes that you have a basic understanding of system administration. Consequently, many of the more basic issues are presented in a very cursory manner. The intent is more to jog your memory, or to fill an occasional gap, than to teach the fundamentals of system administration. If you are new to system administration, a number of the books listed in Appendix B provide excellent introductions. If, on the other hand, you are a knowledgeable system administrator, you will probably want to skim or even skip this chapter.
Chapter 1 lists several reasons why you might not know the details of your network and the computers on it. This chapter assumes that you are faced with a networked computer and need to determine or reconstruct its configuration. It should be obvious that if you don't understand how a system is configured, you will not be able to change its configuration or correct misconfigurations. The tools described in this chapter can be used to discover or change a host's configuration.
As discussed in Chapter 1, if you have documentation for the system, begin with it. The assumption here is that such documentation does not exist or that it is incomplete. The primary focus is network configuration, but many of the techniques can easily be generalized.
If you have inherited a multiuser system that has been in service for several years with many undocumented customizations, reconstructing its configuration can be an extremely involved and extended process. If your system has been compromised, the intruder has taken steps to hide her activity, and you aren't running an integrity checker like tripwire, it may be virtually impossible to discover all her customizations. (tripwire is discussed briefly in Chapter 11.) While it may not be feasible, you should at least consider reinstalling the system from scratch. While this may seem draconian, it may ultimately be much less work than fighting the same battles over and over, as often happens with compromised systems. The best way to do this is to set up a replacement system in parallel and then move everyone over. This, of course, requires a second system.
Utilities
Reviewing system configuration files is a necessary step that you will have to address before you can claim mastery of a system. But this can be a very time-consuming step. It is very easy to overlook one or more key files. If you are under time pressure to resolve a problem, configuration files are not the best place to start.
Even if you plan to jump into the configuration files, you will probably want a quick overview of the current state of the system before you begin. For this reason, we will examine status and configuration utilities first. This approach has the advantage of being pretty much the same from one version of Unix to the next. With configuration files, the differences among the various flavors of Unix can be staggering. Even when the files have the same functionality and syntax, they can go by different names or be in different directories. Certainly, using these utilities is much simpler than looking at kernel configuration files.
The output provided by these utilities may vary considerably from system to system and will depend heavily on which options are used. In practice, this should present no real problem. Don't be alarmed if the output on your system is formatted differently.
The first thing any system administrator should do on a new system is run the ps command. You are probably already familiar with ps so I won't spend much time on it. The ps command lists which processes are running on the system. Here is an example:
bsd4# ps -aux
USER     PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
root    6590 22.0  2.1   924  616  ??  R    11:14AM   0:09.80 inetd: chargen [2
root       1  0.0  0.6   496  168  ??  Ss   Fri09AM   0:00.03 /sbin/init --
root       2  0.0  0.0     0    0  ??  DL   Fri09AM   0:00.52  (pagedaemon)
root       3  0.0  0.0     0    0  ??  DL   Fri09AM   0:00.00  (vmdaemon)
root       4  0.0  0.0     0    0  ??  DL   Fri09AM   0:44.05  (syncer)
root     100  0.0  1.7   820  484  ??  Ss   Fri09AM   0:02.14 syslogd
daemon   109  0.0  1.5   828  436  ??  Is   Fri09AM   0:00.02 /usr/sbin/portmap
root     141  0.0  2.1   924  616  ??  Ss   Fri09AM   0:00.51 inetd
root     144  0.0  1.7   980  500  ??  Is   Fri09AM   0:03.14 cron
root     150  0.0  2.8  1304  804  ??  Is   Fri09AM   0:02.59 sendmail: accepti
root     173  0.0  1.3   788  368  ??  Is   Fri09AM   0:01.84 moused -p /dev/ps
root     213  0.0  1.8   824  508  v1  Is+  Fri09AM   0:00.02 /usr/libexec/gett
root     214  0.0  1.8   824  508  v2  Is+  Fri09AM   0:00.02 /usr/libexec/gett
root     457  0.0  1.8   824  516  v0  Is+  Fri10AM   0:00.02 /usr/libexec/gett
root    6167  0.0  2.4  1108  712  ??  Ss    4:10AM   0:00.48 telnetd
jsloan  6168  0.0  0.9   504  252  p0  Is    4:10AM   0:00.09 -sh (sh)
root    6171  0.0  1.1   464  320  p0  S     4:10AM   0:00.14 -su (csh)
root       0  0.0  0.0     0    0  ??  DLs  Fri09AM   0:00.17  (swapper)
root    6597  0.0  0.8   388  232  p0  R+   11:15AM   0:00.00 ps -aux
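If you want to work with this output in a script, a minimal Python sketch like the following can split each line into named fields. The function name parse_ps and the SAMPLE text are illustrative only. The sketch assumes the column layout shown above, with the command as the last column and no embedded spaces in any other column; some versions of ps break that assumption (for example, when the start time is printed as a month and a day).

```python
SAMPLE = """\
USER     PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
root    6590 22.0  2.1   924  616  ??  R    11:14AM   0:09.80 inetd: chargen [2
root     100  0.0  1.7   820  484  ??  Ss   Fri09AM   0:02.14 syslogd
jsloan  6168  0.0  0.9   504  252  p0  Is    4:10AM   0:00.09 -sh (sh)
"""


def parse_ps(text):
    """Turn ps-style output into a list of dicts keyed by the header names."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Split on whitespace, but keep the final column (the command,
        # which may contain spaces) in one piece.
        fields = line.split(None, len(header) - 1)
        if len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


def processes_owned_by(rows, user):
    """Return the commands of all processes belonging to one user."""
    return [row["COMMAND"] for row in rows if row["USER"] == user]
```

All values are kept as strings, so convert fields such as PID or %CPU yourself if you need to sort or compare them numerically.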
System Configuration Files
A major problem with configuration files under Unix is that there are so many of them in so many places. On a multiuser system that provides a variety of services, there may be scores of configuration files scattered among dozens of directories. Even worse, it seems that every implementation of Unix is different. Even different releases of the same flavor of Unix may vary. Add to this the complications that multiple applications contribute and you have a major undertaking. If you are running a number of different platforms, you have your work cut out for you.
For these reasons, it is unrealistic to attempt to give an exhaustive list of configuration files. It is possible, however, to discuss configuration files by categories. The categories can then serve as a guide or reminder when you construct your own lists so that you don't overlook an important group of files. Just keep in mind that what follows is only a starting point. You will have to discover your particular implementations of Unix one file at a time.
There are a number of fairly standard configuration files that seem to show up on most systems. These are usually, but not always, located in the /etc directory. (For customization, you may see a number of files in the /usr/local or /usr/opt directories or their subdirectories.) When looking at files, this is clearly the first place to start. Your system will probably include many of the following: defaultdomain, defaultroute, ethers, gateways, host.conf, hostname, hosts, hosts.allow, hosts.equiv, inetd.conf, localhosts, localnetworks, named.boot, netmasks, networks, nodename, nsswitch.conf, protocols, rc, rc.conf, rc.local, resolv.conf, and services. You won't find all of these on a single system. Each version and release will have its own conventions. For example, Solaris puts the host's name in nodename. With BSD, it is set in rc.conf. Customizations may change these as well. Thus, the locations and names of files will vary from system to system.
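One quick way to see which of these files a particular host actually has is to check for each name in turn. The following Python sketch does just that. The function name find_config_files is hypothetical, the list of names is simply the one given above, and /etc is assumed only as a default starting directory.

```python
from pathlib import Path

# File names taken from the list in the text; a given system will
# have only some of them.
CANDIDATES = [
    "defaultdomain", "defaultroute", "ethers", "gateways", "host.conf",
    "hostname", "hosts", "hosts.allow", "hosts.equiv", "inetd.conf",
    "localhosts", "localnetworks", "named.boot", "netmasks", "networks",
    "nodename", "nsswitch.conf", "protocols", "rc", "rc.conf", "rc.local",
    "resolv.conf", "services",
]


def find_config_files(directory="/etc", names=CANDIDATES):
    """Return the sorted names from the list that exist as regular files."""
    base = Path(directory)
    return sorted(name for name in names if (base / name).is_file())
```

Calling find_config_files() with no arguments checks /etc. Pass another directory, such as one under /usr/local, to look for local customizations. A name missing from the result means only that the file is not in that directory, not that the setting is unconfigured.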
Microsoft Windows
Networking with Windows can be quite complicated, since it may involve Microsoft's proprietary enhancements. Fortunately, Microsoft's approach to TCP/IP is pretty standard. As with Unix, you can approach the various versions of Windows by looking at configuration parameters or by using utilities to examine the current configuration. For the most part, you won't be examining files directly under Windows, at least for versions later than Windows for Workgroups. Rather, you'll use the utilities that Windows provides. (There are exceptions. For example, like Unix, Windows has hosts, protocol, and services files.)
If you are looking for basic information quickly, Microsoft provides one of two programs for this purpose, depending on which version of Windows you use. The utility winipcfg is included with Windows 95/98. A command-line program, ipconfig, is included with Windows NT and Windows 2000 and in Microsoft's TCP/IP stack for Windows for Workgroups. Both programs provide the same information. winipcfg produces a pop-up window giving the basic parameters such as the Ethernet address, the IP address, the default route, the name server's address, and so on (see Figure 2-2). You can invoke the program by entering the program name from Run on the Start menu or in a DOS window. The most basic parameters will be displayed. Additional information can be obtained by using the /all option or by clicking on the More Info >> button.
Figure 2-2: winipcfg
For ipconfig, start a DOS window. You can use the command switch /all to get the additional details.
As in Unix, the utilities arp, hostname, and netstat are available. All require a DOS window to run. There are a few differences in syntax, but they work basically the same way and provide the same sorts of information. For example, arp -a will list all the entries in the ARP table:
C:\>arp -a

Interface: 205.153.63.30 on Interface 2
  Internet Address      Physical Address      Type
  205.153.63.1          00-00-a2-c6-28-44     dynamic
  205.153.63.239        00-60-97-06-22-22     dynamic
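If you collect this output from several machines, it can be convenient to reduce it to address pairs. The following Python sketch assumes the Windows layout shown above (dotted-decimal IP address, hyphen-separated hardware address, entry type). The function name parse_arp_table and the SAMPLE text are illustrative only; other systems format their ARP tables differently.

```python
import re

SAMPLE = """\
Interface: 205.153.63.30 on Interface 2
  Internet Address      Physical Address      Type
  205.153.63.1          00-00-a2-c6-28-44     dynamic
  205.153.63.239        00-60-97-06-22-22     dynamic
"""

# IP address, six hyphen-separated hex octets, then the entry type.
ARP_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+"
    r"(\w+)\s*$"
)


def parse_arp_table(text):
    """Extract (ip, mac, type) entries from Windows-style 'arp -a' output."""
    entries = []
    for line in text.splitlines():
        match = ARP_LINE.match(line)
        if match:
            ip, mac, kind = match.groups()
            entries.append({"ip": ip, "mac": mac.lower(), "type": kind})
    return entries
```

Header lines and the Interface line do not match the pattern, so they are skipped without any special handling.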
Chapter 3: Connectivity Testing
This chapter describes simple tests for individual network links and for end-to-end connectivity between networked devices. The tools described in this chapter are used to show that there is a functioning connection between two devices. These tools can also be used for more sophisticated testing, including the discovery of path characteristics and general performance measurements. These additional uses are described in Chapter 4. Tools used for testing protocol issues related to connectivity are described in Chapter 9. You may want to turn next to these chapters if you need additional information in either of these areas.
This chapter begins with a quick review of cabling practices. If your cabling isn't adequate, that's the first thing you need to address. Next, there is a lengthy discussion of using ping to test connectivity along with issues that might arise when using ping, such as security problems. Next, I describe alternatives to ping. Finally, I discuss alternatives that run on Microsoft Windows platforms.
Cabling
For most managers, cabling is the most boring part of a network. Even administrators who are normally control freaks will often jump at the opportunity to delegate or cede responsibility for cabling to somebody else. It has none of the excitement of new equipment or new software. It is often hidden away in wiring closets, walls, and ceilings. When it is visible, it is usually in the way or an eyesore. The only time most managers think about cabling is when it is causing problems. Yet, unless you are one of a very small minority running a wireless network, it is the core of your network. Without adequate cabling, you don't have a network.
Although this is a book about software tools, not cabling, the topics are not unrelated. If you have a cabling problem, you may need to turn to the tools described later in this chapter to pinpoint the problem. Conversely, to properly use these tools, you can't ignore cabling, as it may be the real source of your problems.
If a cable is damaged, it won't be difficult to recognize the problem. But intermittent cabling problems can be a nightmare to solve. The problem may be difficult to recognize as a cabling problem. It may come and go, working correctly most of the time. The problem may arise in cables that have been in use for years. For example, I once watched a technician try to deal with a small classroom LAN that had been in use for more than five years and would fail only when the network was heavily loaded, i.e., if and only if there was a scheduled class in the room. The problem took weeks before what proved to be a cabling problem was resolved. In the meantime, several classes were canceled.
A full discussion of cabling practices, standards, and troubleshooting has been the topic of several books, so this coverage will be very selective. I am assuming that you are familiar with the basics. If not, several references in Appendix B provide a general but thorough introduction to cabling.
With cabling, as with most things, it is usually better to prevent problems than to deal with them afterward. The best way to avoid cabling problems is to take a proactive approach. While some of the following suggestions may seem excessive, the costs are minimal when compared to what can be involved in solving a problem.
Testing Adapters
While most problems with adapters, such as Ethernet cards, are configuration errors, sometimes adapters do fail. Without getting into the actual electronics, there are generally three simple tests you can make with adapters. However, each has its drawbacks:
  • If you have some doubts about whether the problem is in the adapter or network, you might try eliminating the bulk of the network from your tests. The easiest approach is to create a two-computer network using another working computer. If you use coaxial cable, simply run a cable known to be good between the computers and terminate each end appropriately. For twisted pair, use a crossover cable, i.e., a patch cable with send and receive crossed. If all is well, the computers should be able to communicate. If they don't, you should have a pretty clear idea of where to look next.
    The crossover cable approach is analogous to setting up a serial connection using a null modem. You may want to first try this method with two working computers just to verify you are using the right kind of cable. You should also be sure IP numbers and masks are set appropriately on each computer. Clearly, the drawbacks with this approach are shuffling computers around and finding the right cable. But if you have a portable computer available, the shuffling isn't too difficult.
  • A second alternative is to use the configuration and test software provided by the adapter's manufacturer. If you bought the adapter as a separate purchase, you probably already have this software. If your adapter came with your computer, you may have to go to the manufacturer's web page and download the software. This approach can be helpful, particularly with configuration errors. For example, a combination adapter might be configured for coaxial cable while you are trying to use it with twisted pair. You may be able to change interrupts, DMA channels, memory locations, bus mastering configuration, and framing types with this software.
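For the two-computer test described in the first item, the addresses and masks matter: with no router between the machines, both must be on the same IP network. The following Python sketch shows one way to check this before you start moving cables. The function name same_subnet and the example addresses are hypothetical, and the sketch assumes both hosts are configured with the same mask.

```python
import ipaddress


def same_subnet(addr_a, addr_b, netmask):
    """Return True if both addresses fall in the same network for this mask."""
    # strict=False lets us pass a host address rather than a network address.
    net_a = ipaddress.ip_network(f"{addr_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{netmask}", strict=False)
    return net_a == net_b
```

For example, same_subnet("192.168.1.10", "192.168.1.20", "255.255.255.0") is true, while the same call with "192.168.2.20" as the second address is false. In the second case, the two computers would not be able to reach each other directly over a crossover cable, even with working adapters.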
Software Testing with ping
Thus far, I have described ways to examine electrical and mechanical problems. The tools described in this section, ping and its variants, focus primarily on the software problems and the interaction of software with hardware. When these tools successfully communicate with remote systems, you have established basic connectivity. Your problem is almost certainly at a higher level in your system.
With these tools, you begin with the presumption that your hardware is working correctly. If the link light is out on the local host, these tools will tell you nothing you don't already know. But if you simply suspect a hardware problem somewhere on your network, these tools may help you locate the problem. Once you know the location of the problem, you will use the techniques previously described to resolve it. These tools can also provide insight when your hardware is marginal or when you have intermittent failures.
While there are several useful programs for analyzing connectivity, unquestionably ping is the most commonly used program. As it is required by the IP RFC, it is almost always available as part of the networking software supplied with any system. In addition, numerous enhanced versions of ping are available at little or no cost. There are even web sites that will allow you to run ping from their sites.
Moreover, the basic idea has been adapted from IP networks to other protocols. For example, Cisco's implementation of ping has an optional keyword to check connectivity among routers using AppleTalk, DECnet, or IPX. ping is nearly universal.
ping was written by Mike Muuss. Inspired by echolocation, the name comes from the sound that sonar makes. The name ping is frequently described as an acronym for Packet InterNet Groper. But, according to Muuss's web page, the acronym was applied to the program after the fact by someone else.
It is, in essence, a simple program based on a simple idea. (Muuss describes it as a 1000-line hack that was completed in about one evening.) One network device sends a request for a reply to another device and records the time the request was sent. The device receiving the request sends a packet back. When the reply is received, the round-trip time for packet propagation can be calculated. The receipt of a reply indicates a working connection. This elapsed time provides an indication of the length of the path. Consistency among repeated queries gives an indication of the quality of the connection.
Microsoft Windows
The various versions of Windows include implementations of ping. With the Microsoft implementation, there are a number of superficial differences in syntax and somewhat less functionality. Basically, however, it works pretty much as you might expect. The default is to send four packets, as shown in the two following examples. In the first, we successfully ping the host www.cabletron.com:
C:\>ping www.cabletron.com

Pinging www.cabletron.com [204.164.189.90] with 32 bytes of data:

Reply from 204.164.189.90: bytes=32 time=100ms TTL=239
Reply from 204.164.189.90: bytes=32 time=100ms TTL=239
Reply from 204.164.189.90: bytes=32 time=110ms TTL=239
Reply from 204.164.189.90: bytes=32 time=90ms TTL=239

C:\>
In the next example, we are unable to reach www.microsoft.com for reasons previously explained:
C:\>ping www.microsoft.com

Pinging microsoft.com [207.46.130.149] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.
Note that this is run in a DOS window. If you use ping without an argument, you will get a description of the basic syntax and a listing of the various options:
C:\>ping

Usage: ping [-t] [-a] [-n count] [-l size] [-f] [-i TTL] [-v TOS]
            [-r count] [-s count] [[-j host-list] | [-k host-list]]
            [-w timeout] destination-list

Options:
    -t             Ping the specifed host until interrupted.
    -a             Resolve addresses to hostnames.
    -n count       Number of echo requests to send.
    -l size        Send buffer size.
    -f             Set Don't Fragment flag in packet.
    -i TTL         Time To Live.
    -v TOS         Type Of Service.
    -r count       Record route for count hops.
    -s count       Timestamp for count hops.
    -j host-list   Loose source route along host-list.
    -k host-list   Strict source route along host-list.
    -w timeout     Timeout in milliseconds to wait for each reply.
Notice that the flooding options, fortunately, are absent and that the -t option is used to get output similar to that shown in most of our examples. The implementation does not provide a summary at the end, however.
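Since no summary is printed, you can compute one yourself from captured output. The following Python sketch assumes the reply and timeout lines look exactly like those in the examples above. The function name summarize_ping and the SAMPLE_OK text are illustrative only, and other failure messages that ping may print are simply not counted.

```python
import re

SAMPLE_OK = """\
Pinging www.cabletron.com [204.164.189.90] with 32 bytes of data:

Reply from 204.164.189.90: bytes=32 time=100ms TTL=239
Reply from 204.164.189.90: bytes=32 time=100ms TTL=239
Reply from 204.164.189.90: bytes=32 time=110ms TTL=239
Reply from 204.164.189.90: bytes=32 time=90ms TTL=239
"""

REPLY = re.compile(r"^\s*Reply from \S+ bytes=\d+ time[=<](\d+)ms TTL=\d+",
                   re.MULTILINE)
TIMEOUT = re.compile(r"^\s*Request timed out\.", re.MULTILINE)


def summarize_ping(output):
    """Count replies and timeouts and compute round-trip statistics in ms."""
    times = [int(t) for t in REPLY.findall(output)]
    lost = len(TIMEOUT.findall(output))
    summary = {"sent": len(times) + lost, "received": len(times), "lost": lost}
    if times:
        summary["min_ms"] = min(times)
        summary["max_ms"] = max(times)
        summary["avg_ms"] = sum(times) / len(times)
    return summary
```

For the successful example, this reports four packets sent and received with round-trip times between 90 and 110 ms. When every request times out, the timing fields are omitted rather than reported as zero.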
Chapter 4: Path Characteristics
In the last chapter, we attempted to answer a fundamental question, "Do we have a working network connection?" We used tools such as ping to verify basic connectivity. But simple connectivity is not enough for many purposes. For example, an ISP can provide connectivity but not meet your needs or expectations. If your ISP is not providing the level of service you think it should, you will need something to base your complaints on. Or, if the performance of your local network isn't adequate, you will want to determine where the bottlenecks are located before you start implementing expensive upgrades. In this chapter, we will try to answer the question, "Is our connection performing reasonably?"
We will begin by looking at ways to determine which links or individual connections compose a path. This discussion focuses on the tool traceroute. Next, we will turn to several tools that allow us to identify those links along a path that might cause problems. Once we have identified individual links of interest, we will examine some simple ways to further characterize the performance of those links, including estimating the bandwidth of a connection and measuring the available throughput.
Path Discovery with traceroute
This section describes traceroute, a tool used to discover the links along a path. While this is the first step in investigating a path's behavior and performance, it is useful for other tasks as well. In the previous discussion of ping, it was suggested that you work your way, hop by hop, toward a device you can't reach to discover the point of failure. This assumes that you know the path.
Path discovery is also an essential step in diagnosing routing problems. While you may fully understand the structure of your network and know what path you want your packets to take through your network, knowing the path your packets actually take is essential information and may come as a surprise.
Once packets leave your network, you have almost no control over the path they actually take to their destination. You may know very little about the structure of adjacent networks. Path discovery can reveal who a remote site's ISP is, how your own ISP is connected to the world, and other information such as peering arrangements. traceroute is the tool of choice for collecting this kind of information.
The traceroute program was written by Van Jacobson and others. It is based on a clever use of the Time-To-Live (TTL) field in the IP packet's header. The TTL field, described briefly in the last chapter, is used to limit the life of a packet. When a router fails or is misconfigured, a routing loop or circular path may result. The TTL field prevents packets from remaining on a network indefinitely should such a routing loop occur. A packet's TTL field is decremented each time the packet crosses a router on its way through a network. When its value reaches 0, the packet is discarded rather than forwarded. The router that discards the packet sends an ICMP TIME_EXCEEDED message back to the packet's source to report the discard. By manipulating the TTL field of the packets it sends, traceroute uses the information in these ICMP messages to discover paths through a network.
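The TTL mechanism described above can be sketched as a small simulation. This is an illustrative model only, not the traceroute source: the names `Reply`, `send_probe`, and `trace` are hypothetical, the path is a plain list of addresses, and the final reply is labeled generically as `DESTINATION_REACHED` rather than modeling the actual message a destination returns.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    kind: str    # "TIME_EXCEEDED" or "DESTINATION_REACHED" (simplified labels)
    sender: str  # address of the device that generated the reply


def send_probe(path, ttl):
    """Forward one probe along `path` (routers first, destination last).

    Each router decrements the TTL. A router that brings the TTL to 0
    discards the probe and reports TIME_EXCEEDED to the source.
    """
    for router in path[:-1]:
        ttl -= 1
        if ttl == 0:
            return Reply("TIME_EXCEEDED", router)
    return Reply("DESTINATION_REACHED", path[-1])


def trace(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... and record who answers each one."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        reply = send_probe(path, ttl)
        hops.append((ttl, reply.sender))
        if reply.kind == "DESTINATION_REACHED":
            break
    return hops


if __name__ == "__main__":
    # Hypothetical three-hop path: two routers and the destination host.
    for ttl, sender in trace(["10.0.0.1", "172.16.0.1", "192.0.2.9"]):
        print(ttl, sender)
```

Each probe travels exactly one hop farther than the previous one, so the sequence of reply senders is the path itself.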
Path Performance
Once you have a picture of the path your traffic is taking, the next step in testing is to get some basic performance numbers. Evaluating path performance will mean doing three types of measurements. Bandwidth measurements will give you an idea of the hardware capabilities of your network, such as the maximum capacity of your network. Throughput measurements will help you discover what capacity your network provides in practice, i.e., how much of the maximum is actually available. Traffic measurements will give you an idea of how the capacity is being used.
My goal in this section is not a definitive analysis of performance. Rather, I describe ways to collect some general numbers that can be used to see if you have a reasonable level of performance or if you need to delve deeper. If you want to go beyond the quick-and-dirty approaches described here, you might consider some of the more advanced tools described in Chapter 9. The tools mentioned here should help you focus your efforts.
Several terms are used, sometimes inconsistently, to describe the capacity or performance of a link. Without getting too formal, let's review some of these terms to avoid potential confusion.
Two factors determine how long it takes to send a packet or frame across a single link. The amount of time it takes to put the signal onto the cable is known as the transmission time or transmission delay. This will depend on the transmission rate (or interface speed) and the size of the frame. The amount of time it takes for the signal to travel across the cable is known as the propagation time or propagation delay. Propagation time is determined by the type of media used and the distance involved. It often comes as a surprise that a signal transmitted at 100 Mbps will have the same propagation delay as a signal transmitted at 10 Mbps. The first signal is being transmitted 10 times as fast, but, once it is on a cable, it doesn't propagate any faster. That is, the difference between 10 Mbps and 100 Mbps is not the speed the bits travel, but the length of the bits.
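A short calculation may make the distinction concrete. The sketch below assumes a signal speed of 2 x 10^8 meters per second in the cable (roughly two-thirds the speed of light, a common rule of thumb and not a figure from the text); the function names are illustrative.

```python
def transmission_delay(frame_bytes, rate_bps):
    """Seconds needed to clock every bit of the frame onto the cable."""
    return frame_bytes * 8 / rate_bps


def propagation_delay(distance_m, signal_speed_mps=2e8):
    """Seconds for a signal to travel the cable; independent of the bit rate."""
    return distance_m / signal_speed_mps


def bit_length(rate_bps, signal_speed_mps=2e8):
    """Meters of cable occupied by a single bit at the given rate."""
    return signal_speed_mps / rate_bps


if __name__ == "__main__":
    for rate in (10e6, 100e6):
        print(
            f"{rate / 1e6:.0f} Mbps:",
            f"transmission {transmission_delay(1500, rate) * 1e3:.2f} ms,",
            f"propagation {propagation_delay(100) * 1e6:.2f} us,",
            f"bit length {bit_length(rate):.0f} m",
        )
```

Under these assumptions, a 1500-byte frame takes 1.2 ms to transmit at 10 Mbps and 0.12 ms at 100 Mbps, while the propagation delay over 100 meters is 0.5 microseconds in both cases. Only the length of each bit on the cable changes, from about 20 meters to about 2 meters.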
Microsoft Windows
Most of the tools we have been discussing are available in one form or another for Windows platforms. Microsoft's implementation of traceroute, known as tracert, has both superficial and fundamental differences from the original implementation. Like ping, tracert requires a DOS window to run. We have already seen an example of its output. tracert has fewer options, and there are some superficial differences in their flags. But most of traceroute's options are rarely used anyway, so this isn't much of a problem.
A more fundamental difference between Microsoft's tracert and its Unix relative is that tracert uses ICMP packets rather than UDP packets as probes. This isn't necessarily bad, just different, although tracert's behavior may be surprising in some cases. In fact, if you have access to both traceroute and tracert, you may be able to use the difference to your advantage in some unusual circumstances. One obvious implication is that routers that block ICMP messages will block tracert's probes, while traceroute's UDP probes may still be passed.
As noted earlier in this chapter, Mentor's Java implementation of ttcp runs under Windows if you can find it. Both netperf and iperf have also been ported to Windows. Another freely available program worth considering is Qcheck from Ganymede Software, Inc. This program requires that Ganymede's Performance Endpoints software be installed on systems at each end of the link. This software is also provided at no cost and is available for a wide variety of systems ranging from Windows to MVS. In addition to supporting IP, the software supports SPX and IPX protocols. The software provides ping-like connectivity checks, as well as response time and throughput measurements.
As noted in Chapter 2, Microsoft also provides its own version of netstat. The options of interest here are -e and -s. The -e option gives a brief summary of activity on any Ethernet interface:
C:\>netstat -e
Interface Statistics

                           Received            Sent

Bytes                       9840233         2475741
Unicast packets               15327           16414
Non-unicast packets            9268             174
Discards                          0               0
Errors                            0               0
Unknown protocols               969
Chapter 5: Packet Capture
Packet capture and analysis is the most powerful technique that will be discussed in this book—it is the ultimate troubleshooting tool. If you really want to know what is happening on your network, you will need to capture traffic. No other tool provides more information.
On the other hand, no other tool requires the same degree of sophistication to use. If misused, it can compromise your system's security and invade the privacy of your users. Of the software described in this book, packet capture software is the most difficult to use to its full potential and requires a thorough understanding of the underlying protocols to be used effectively. As noted in Chapter 1, you must ensure that what you do conforms to your organization's policies and any applicable laws. You should also be aware of the ethical implications of your actions.
This chapter begins with a discussion of the types of tools available and various issues involved in traffic capture. I then describe tcpdump, a ubiquitous and powerful packet capture tool, followed by a brief description of other closely related tools. Next is a discussion of ethereal, a powerful protocol analyzer that is rapidly gaining popularity, and of some of the problems created by traffic capture. The chapter concludes with a discussion of packet capture tools available for use with Microsoft Windows platforms.
Traffic Capture Tools
Packet capture is the real-time collection of data as it travels over networks. Tools for the capture and analysis of traffic go by a number of names including packet sniffers, packet analyzers, protocol analyzers, and even traffic monitors. Although there is some inconsistency in how these terms are used, the primary difference is in how much analysis or interpretation is provided after a packet is captured. Packet sniffers generally do the least amount of analysis, while protocol analyzers provide the greatest level of interpretation. Packet analyzers typically lie somewhere in between. All have the capture of raw data as a core function. Traffic monitors typically are more concerned with collecting statistical information, but many support the capture of raw data. Any of these may be augmented with additional functions such as graphing utilities and traffic generators. This chapter describes tcpdump, a packet sniffer, several analysis tools, and ethereal, a protocol analyzer.
While packet capture might seem like a low-level tool, it can also be used to examine what is happening at higher levels, including the application level, because of the way data is encapsulated. Since application data is encapsulated in a generally transparent way by the lower levels of the protocol stack, the data is basically intact when examined at a lower level. By examining network traffic, we can examine the data generated at the higher levels. (In general, however, it is usually much easier to debug an application using a tool designed for that application. Tools specific to several application-level protocols are described in Chapter 10.)
Packet capture programs also require the most technical expertise of any program we will examine. A thorough understanding of the underlying protocol is often required to interpret the results. For this reason alone, packet capture is a tool that you want to become familiar with well before you need it. When you are having problems, it will also be helpful to have comparison systems so you can observe normal behavior. The time to learn how your system works is before you have problems. This technique cannot be stressed enough—do a baseline run for your network periodically and analyze it closely so you know what traffic you expect to see on your network before you have problems.
Access to Traffic
You can capture traffic only on a link that you have access to. If you can't get traffic to an interface, you can't capture it with that interface. While this might seem obvious, it may be surprisingly difficult to get access to some links on your network. On some networks, this won't be a problem. For example, 10Base2 and 10Base5 networks have shared media, at least between bridges and switches. Computers connected to a hub are effectively on a shared medium, and the traffic is exposed. But on other systems, watch out!
Clearly, if you are trying to capture traffic from a host on one network, it will never see the local traffic on a different network. But the problem doesn't stop there. Some networking devices, such as bridges and switches, are designed to contain traffic so that it is seen only by parts of the local network. On a switched network, only a limited amount of traffic will normally be seen at any interface: unicast traffic to or from the attached host, plus multicast and broadcast traffic. If this includes the traffic you are interested in, so much the better. But if you are looking at general network traffic, you will need to use other approaches.
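To see why a capture host on a switched network sees so little, consider a toy model of a learning switch. This is a simplified sketch under stated assumptions, not a description of any particular product: the class `LearningSwitch` and its method names are hypothetical, multicast is ignored, and frames to addresses the switch has not yet learned are assumed to be flooded to every other port.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy model: learn source addresses, forward on one port when possible."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port where it was last seen

    def forward(self, in_port, src, dst):
        """Return the list of ports the frame is sent out on."""
        self.table[src] = in_port  # learn where the sender lives
        if dst != BROADCAST and dst in self.table:
            out_port = self.table[dst]
            # A frame whose destination is on the arrival port is not forwarded.
            return [] if out_port == in_port else [out_port]
        # Broadcasts and not-yet-learned destinations go out every other port.
        return [p for p in self.ports if p != in_port]


if __name__ == "__main__":
    switch = LearningSwitch([1, 2, 3])  # a capture host sits on port 3
    print(switch.forward(1, "aa:aa", "bb:bb"))    # destination unknown: flooded
    print(switch.forward(2, "bb:bb", "aa:aa"))    # learned: only port 1
    print(switch.forward(1, "aa:aa", "bb:bb"))    # learned: only port 2
    print(switch.forward(1, "aa:aa", BROADCAST))  # broadcast: all other ports
```

In this model the capture host on port 3 sees the first flooded frame and the broadcast, but none of the later conversation between the other two hosts.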
Not being able to capture data on an interface has both positive and negative ramifications. The primary benefit is that it is possible to control access to traffic with an appropriate network design. By segmenting your network, you can limit access to data, improving security and enhancing privacy.
Lack of access to data can become a serious problem, however, when you must capture that traffic. There are several basic approaches to overcome this problem. First, you can try to physically go to the traffic by using a portable computer to collect the data. This has the obvious disadvantage of requiring that you travel to the site. This may not be desirable or possible. For example, if you are addressing a security problem, it may not be feasible to monitor at the source of the suspected attack without revealing what you are doing. And if you need to collect data at multiple points simultaneously, you clearly cannot be in several places at once by yourself.
Capturing Data
Packet capture may be done by software running on a networked host or by hardware/software combinations designed specifically for that purpose. Devices designed specifically for capturing traffic often have high-performance interfaces that can capture large amounts of data without loss. These devices will also capture frames with framing errors—frames that are often silently discarded with more conventional interfaces. More conventional interfaces may not be able to keep up with high traffic levels so packets will be lost. Programs like tcpdump give summary statistics, reporting the number of packets lost. On moderately loaded networks, however, losing packets should not be a problem. If dropping packets becomes a problem, you will need to consider faster hardware or, better yet, segmenting your network.
Packet capture software works by placing the network interface in promiscuous mode. In normal operations, the network interface captures and passes on to the protocol stack only those packets addressed to the interface's unicast address, packets sent to a multicast address that matches a configured address for the interface, or broadcast packets. In promiscuous mode, all packets are captured regardless of their destination address.
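The acceptance rule just described can be written as a small predicate. This is a sketch of the logic only; real interfaces apply the filter in hardware, and the function name `nic_accepts` and its parameters are illustrative assumptions.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def nic_accepts(dst_mac, own_mac, multicast_groups=(), promiscuous=False):
    """Decide whether a frame is passed up to the protocol stack."""
    if promiscuous:
        return True  # every frame is captured, whatever its destination
    dst = dst_mac.lower()
    if dst == BROADCAST:
        return True  # broadcast frames are always accepted
    if dst == own_mac.lower():
        return True  # frame addressed to this interface's unicast address
    # Otherwise accept only multicast addresses configured on the interface.
    return dst in {group.lower() for group in multicast_groups}


if __name__ == "__main__":
    me = "00:11:22:33:44:55"
    other = "00:aa:bb:cc:dd:ee"
    print(nic_accepts(other, me))                    # normal mode
    print(nic_accepts(other, me, promiscuous=True))  # promiscuous mode
```

The only difference between the two calls is the `promiscuous` flag, which is why a capture program needs the privilege to set it.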
While the vast majority of interfaces can be placed in promiscuous mode, a few are manufactured not to allow this. If in doubt, consult the documentation for your interface. Additionally, on Unix systems, the operating system software must be configured to allow promiscuous mode. Typically, placing an interface in promiscuous mode requires root privileges.
tcpdump
The tcpdump program was developed at the Lawrence Berkeley Laboratory at the University of California, Berkeley, by Van Jacobson, Craig Leres, and Steven McCanne. It was originally developed to analyze TCP/IP performance problems. A number of features have been added over time although some options may not be available with every implementation. The program has been ported to a wide variety of systems and comes preinstalled on many systems.
For a variety of reasons, tcpdump is an ideal tool to begin with. It is freely available, runs on many Unix platforms, and has even been ported to Microsoft Windows. Features of its syntax and its file format have been used or supported by a large number of subsequent programs. In particular, its capture software, libpcap, is frequently used by other capture programs. Even when proprietary programs with additional features exist, the universality of tcpdump makes it a compelling choice. If you work with a wide variety of platforms, being able to use the same program on all or most of the platforms can easily outweigh small advantages proprietary programs might have. This is particularly true if you use the programs on an irregular basis or don't otherwise have time to fully master them. It is better to know a single program well than several programs superficially. In such situations, special features of other programs will likely go unused.
Since tcpdump is text based, it is easy to run remotely using a Telnet connection. Its biggest disadvantage is a lack of analysis, but you can easily capture traffic, move it to your local machine, and analyze it with a tool like ethereal. Typically, I use tcpdump in text-only environments or on remote computers. I use ethereal in a Microsoft Windows or X Window environment and to analyze tcpdump files.
The simplest way to run tcpdump is interactively by simply typing the program's name. The output will appear on your screen. You can terminate the program by typing Ctrl-C. But unless you have an idle network, you are likely to be overwhelmed by the amount of traffic you capture. What you are interested in will likely scroll off your screen before you have a chance to read it.
Analysis Tools
As previously noted, one reason for using tcpdump is the wide variety of support tools that are available for use with tcpdump or files created with tcpdump. There are tools for sanitizing the data, tools for reformatting the data, and tools for presenting and analyzing the data.
If you are particularly sensitive to privacy or security concerns, you may want to consider sanitize, a collection of five Bourne shell scripts that reduce or condense tcpdump trace files and eliminate confidential information. The scripts renumber host entries and select classes of packets, eliminating all others. This has two primary uses. First, it reduces the size of the files you must deal with, hopefully focusing your attention on a subset of the original traffic that still contains the traffic of interest. Second, it gives you data that can be distributed or made public (for debugging or network analysis) without compromising individual privacy or revealing too much specific information about your network. Clearly, these scripts won't be useful for everyone. But if internal policies constrain what you can reveal, these scripts are worth looking into.
The five scripts included in sanitize are sanitize-tcp, sanitize-syn-fin, sanitize-udp, sanitize-encap, and sanitize-other. Each script filters out inappropriate traffic and reduces the remaining traffic. For example, all non-TCP packets are removed by sanitize-tcp and the remaining TCP traffic is reduced to six fields: an unformatted timestamp, a renumbered source address, a renumbered destination address, the source port, the destination port, and the number of data bytes in the packet.
934303014.772066 205.153.63.30.1174 > 205.153.63.238.23: . ack 3259091394 win 8647 (DF)
                         4500 0028 b30c 4000 8006 2d84 cd99 3f1e
                         cd99 3fee 0496 0017 00ff f9b3 c241 c9c2
                         5010 21c7 e869 0000 0000 0000 0000
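The hexadecimal dump above can be decoded by hand or with a few lines of code. The following sketch pulls the addresses, ports, and number of data bytes out of the dump and then renumbers the addresses. It is not the sanitize implementation: the function names and the renumbering scheme (small integers assigned in order of first appearance) are assumptions made for illustration, and the code assumes an IPv4 packet carrying TCP.

```python
import ipaddress

# The hex dump from the listing above, copied verbatim.
HEX_DUMP = """
4500 0028 b30c 4000 8006 2d84 cd99 3f1e
cd99 3fee 0496 0017 00ff f9b3 c241 c9c2
5010 21c7 e869 0000 0000 0000 0000
"""


def decode_tcp_ip(hex_dump):
    """Extract a few IPv4 and TCP header fields from a hex dump."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    ip_header_len = (raw[0] & 0x0F) * 4            # IHL is in 32-bit words
    total_len = int.from_bytes(raw[2:4], "big")    # IP header plus payload
    tcp = raw[ip_header_len:]
    tcp_header_len = (tcp[12] >> 4) * 4            # data offset, in words
    return {
        "src": str(ipaddress.IPv4Address(raw[12:16])),
        "dst": str(ipaddress.IPv4Address(raw[16:20])),
        "sport": int.from_bytes(tcp[0:2], "big"),
        "dport": int.from_bytes(tcp[2:4], "big"),
        "data_bytes": total_len - ip_header_len - tcp_header_len,
    }


def renumber(addresses):
    """Map each distinct address to 1, 2, 3, ... in order of first appearance."""
    mapping = {}
    for address in addresses:
        mapping.setdefault(address, len(mapping) + 1)
    return mapping


if __name__ == "__main__":
    fields = decode_tcp_ip(HEX_DUMP)
    print(fields)
    print(renumber([fields["src"], fields["dst"]]))
```

The decoded values agree with the summary line tcpdump printed: source 205.153.63.30 port 1174, destination 205.153.63.238 port 23. The packet is a bare acknowledgment, so it carries no data bytes.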
Packet Analyzers
Even with the tools just described, the real limitation with tcpdump is interpreting the data. For many uses, tcpdump may be all you need. But if you want to examine the data within packets, a packet sniffer is not enough. You need a packet analyzer. A large number of packet analyzers are available at tremendous prices. But before you start spending money, you should consider ethereal.
ethereal is available both as an X Window System program for Unix systems and as a Microsoft Windows program. It can be used as a capture tool and as an analysis tool. It uses the same capture engine and file format as tcpdump, so you can use the same filter syntax when capturing traffic, and you can use ethereal to analyze tcpdump files. Actually, ethereal supports two types of filters: capture filters, based on tcpdump, and display filters, used to control what you are looking at. Display filters use a different syntax and are described later in this section.

Section 5.6.1.1: Using ethereal

Usually ethereal will be managed entirely from a windowing environment. While it can be run with command-line options, I've never encountered a use for these. (There is also a text-based version, tethereal.) When you run ethereal, you are presented with a window with three initially empty panes. The initial screen is similar to Figure 5-1 except the panes are empty. (These figures are for the Windows implementation of ethereal, but these windows are almost identical to the Unix version.) If you have a file you want to analyze, you can select File → Open. You can either load a tcpdump file created with the -w option or a file previously saved from ethereal.
Figure 5-1: ethereal
To capture data, select Capture → Start. You will be presented with a Capture Preferences screen like the one shown in Figure 5-2. If you have multiple interfaces, you can select which one you want to use with the first field. The Count: field is used to limit the number of packets you will collect. You can enter a capture filter, using
Dark Side of Packet Capture
What you can do, others can do. Pretty much anything you can discover through packet capture can be discovered by anyone else using packet capture in a similar manner. Moreover, some technologies that were once thought to be immune to packet capture, such as switches, are not as safe as once believed.
Switches are often cited as a way to protect traffic from sniffing. And they really do provide some degree of protection from casual sniffing. Unfortunately, there are several ways to defeat the protection that switches provide.
First, many switches will operate as hubs, forwarding traffic out on every port, whenever their address tables are full. When first initialized, this is the default behavior un