One of the really cool things about the Internet is the way you, or anyone, can see how your traffic is being routed across the Net, and what’s happening to it along the way. This comes in very handy when you’re experiencing some sort of problem connecting to another site. With just a few seconds of research, you can often tell exactly where the problem lies, and this in turn can tell you if it’s something you need to fix yourself, something you need to complain to somebody else about, or something that’s essentially out of your control. It also comes in very handy for evaluating the quality of the explanations you get when you bug your ISP about network outages, which in turn can be an important factor in deciding where to host your web site.
The first network utilities we’re going to talk about are the ping and traceroute commands. These utilities let you probe a TCP/IP network (like the Internet) to see where your data packets are going, how long it’s taking them to get there, and whether any of them are getting lost along the way. (See Packet-Switching 101 if these concepts are new to you.)
The ping command sends a series of test packets to a particular hostname or IP address and measures how long it takes for each one to come back. When you’ve sent enough packets to satisfy your curiosity, you type Ctrl-C, and the program prints out a brief summary and exits. Here’s an example:
[jbc@andros jbc]$ ping www.yahoo.com
PING www.yahoo.com (204.71.200.74): 56 data bytes
64 bytes from 204.71.200.74: icmp_seq=0 ttl=248 time=19.5 ms
64 bytes from 204.71.200.74: icmp_seq=1 ttl=248 time=18.5 ms
64 bytes from 204.71.200.74: icmp_seq=2 ttl=248 time=21.4 ms
64 bytes from 204.71.200.74: icmp_seq=3 ttl=248 time=24.4 ms
64 bytes from 204.71.200.74: icmp_seq=4 ttl=248 time=19.5 ms
64 bytes from 204.71.200.74: icmp_seq=5 ttl=248 time=18.5 ms
64 bytes from 204.71.200.74: icmp_seq=6 ttl=248 time=18.5 ms
64 bytes from 204.71.200.74: icmp_seq=7 ttl=248 time=19.5 ms
64 bytes from 204.71.200.74: icmp_seq=8 ttl=248 time=19.5 ms
64 bytes from 204.71.200.74: icmp_seq=9 ttl=248 time=19.5 ms
--- www.yahoo.com ping statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max = 18.5/19.8/24.4 ms
That’s a very respectable set of ping statistics: I sent 10 packets, got them all back (for 0% packet loss), and had an average round-trip time of 19.8 milliseconds. The various people responsible for the route between the machine where I entered this command and http://www.yahoo.com are doing a fine job.
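If you want to check the arithmetic behind that summary, here is a minimal Python sketch that recomputes it from the reply lines. The names SAMPLE, REPLY, and summarize are my own inventions for illustration, not part of ping, and the regular expression assumes reply lines formatted like the ones in the example above.

```python
import re

# Reply lines copied from the www.yahoo.com example above.
SAMPLE = """\
64 bytes from 204.71.200.74: icmp_seq=0 ttl=248 time=19.5 ms
64 bytes from 204.71.200.74: icmp_seq=1 ttl=248 time=18.5 ms
64 bytes from 204.71.200.74: icmp_seq=2 ttl=248 time=21.4 ms
64 bytes from 204.71.200.74: icmp_seq=3 ttl=248 time=24.4 ms
64 bytes from 204.71.200.74: icmp_seq=4 ttl=248 time=19.5 ms
64 bytes from 204.71.200.74: icmp_seq=5 ttl=248 time=18.5 ms
64 bytes from 204.71.200.74: icmp_seq=6 ttl=248 time=18.5 ms
64 bytes from 204.71.200.74: icmp_seq=7 ttl=248 time=19.5 ms
64 bytes from 204.71.200.74: icmp_seq=8 ttl=248 time=19.5 ms
64 bytes from 204.71.200.74: icmp_seq=9 ttl=248 time=19.5 ms
"""

# Assumed reply format: "... icmp_seq=N ttl=N time=N.N ms"
REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def summarize(reply_text, sent):
    """Return (received, loss_percent, min, avg, max) for ping reply lines."""
    times = [float(m.group(2)) for m in REPLY.finditer(reply_text)]
    received = len(times)
    loss = 100.0 * (sent - received) / sent
    if not times:
        return received, loss, None, None, None
    return received, loss, min(times), sum(times) / received, max(times)


if __name__ == "__main__":
    received, loss, lo, avg, hi = summarize(SAMPLE, sent=10)
    print(f"{received} received, {loss:.0f}% packet loss")
    print(f"round-trip min/avg/max = {lo:.1f}/{avg:.2f}/{hi:.1f} ms")
```

Run on the ten replies above, the average works out to 19.88 ms, which the ping in the example evidently reports as 19.8.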
Now let’s try pinging some other site in a more-distant part of the Net:
[jbc@andros jbc]$ ping www.ontas.com.au
PING www.ontas.com.au (203.60.16.17) from 209.151.249.42 : 56(84) bytes of data.
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=0 ttl=243 time=277.8 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=1 ttl=243 time=275.4 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=2 ttl=243 time=281.6 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=3 ttl=243 time=294.1 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=4 ttl=243 time=288.1 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=5 ttl=243 time=280.7 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=6 ttl=243 time=275.1 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=7 ttl=243 time=273.4 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=8 ttl=243 time=282.5 ms
64 bytes from vws1.southcom.com.au (203.60.16.17): icmp_seq=9 ttl=243 time=271.6 ms
--- www.ontas.com.au ping statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max = 271.6/280.0/294.1 ms
It takes a little longer (280 milliseconds on average), but this still looks pretty healthy, considering that all of my packets are successfully making it to Tasmania and back (from California) in about a quarter second.
How do I know my packets are going to Tasmania? Well, I don’t, technically. But it seems like a good guess, based on the output of another essential network debugging utility: traceroute. The traceroute command lets you traverse the route that your data follows between your machine and some other machine, sending three test packets to each router along the way. Let’s try it on http://www.ontas.com.au:
[jbc@andros jbc]$ traceroute www.ontas.com.au
traceroute to www.ontas.com.au (203.60.16.17), 30 hops max, 38 byte packets
1 chancy-colocate.hq.cyberverse.net (209.151.233.1) 5.843 ms 3.555 ms 0.619 ms
2 216.246.13.129 (216.246.13.129) 6.868 ms 7.195 ms 5.374 ms
3 newDuke-bb.softaware.com (207.155.0.34) 5.414 ms 7.280 ms 7.684 ms
4 aar1-serial6-1-1-0.Anaheim.cw.net (208.172.39.33) 8.603 ms 10.074 ms 8.562 ms
5 acr2-loopback.Anaheim.cw.net (208.172.34.62) 7.325 ms 9.941 ms 8.773 ms
6 optus-networks.Anaheim.cw.net (208.172.33.142) 241.478 ms 245.452 ms 241.621 ms
7 POS4-0-0.rr2.optus.net.au (192.65.89.213) 241.389 ms 269.195 ms 251.966 ms
8 GigEth3-0.sg2.optus.net.au (202.139.191.2) 243.714 ms 252.133 ms 240.937 ms
9 POS2-0.mg1.optus.net.au (202.139.124.82) 254.781 ms 263.256 ms 258.124 ms
10 GigEth1-0-0.mb1.optus.net.au (202.139.188.4) 254.089 ms 255.718 ms 255.304 ms
11 202.139.130.94 (202.139.130.94) 271.595 ms 278.407 ms 273.950 ms
12 Ether2-2.fra-core1.hbt.southcom.com.au (203.31.212.161) 279.584 ms 284.519 ms 295.090 ms
13 vws1.southcom.com.au (203.60.16.17) 280.207 ms 299.295 ms 293.986 ms
Reading down from the top of the traceroute command’s output, I see my packets go:
Through a router owned by Cyberverse, my ISP
Through a machine that doesn’t have a hostname, just an IP address (216.246.13.129)
Through the network of a company called Softaware (my ISP’s upstream provider on this route), via a router whose hostname ends in softaware.com
Through the network of Cable & Wireless (cw.net)
Through the network of Optus (an Australian ISP), via routers whose names end with optus.net.au
To the network of a company called Southern Internet Services (at hosts whose names end with southcom.com.au), which has a web page describing the company as “Tasmania’s Premier ISP” (http://www.southcom.com.au/)
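The same reading-off of networks can be roughed out in code. The Python sketch below groups consecutive hops by the tail end of their hostnames. The function names are hypothetical, the hop list is copied from the output above with the timings trimmed, and the rule for picking out a network name (keep three labels under a two-letter country code, two otherwise) is a crude assumption of mine that will guess wrong for plenty of real hostnames.

```python
import re
from itertools import groupby

# Hop number, hostname, and address from the traceroute above (timings trimmed).
HOPS = """\
1 chancy-colocate.hq.cyberverse.net (209.151.233.1)
2 216.246.13.129 (216.246.13.129)
3 newDuke-bb.softaware.com (207.155.0.34)
4 aar1-serial6-1-1-0.Anaheim.cw.net (208.172.39.33)
5 acr2-loopback.Anaheim.cw.net (208.172.34.62)
6 optus-networks.Anaheim.cw.net (208.172.33.142)
7 POS4-0-0.rr2.optus.net.au (192.65.89.213)
8 GigEth3-0.sg2.optus.net.au (202.139.191.2)
9 POS2-0.mg1.optus.net.au (202.139.124.82)
10 GigEth1-0-0.mb1.optus.net.au (202.139.188.4)
11 202.139.130.94 (202.139.130.94)
12 Ether2-2.fra-core1.hbt.southcom.com.au (203.31.212.161)
13 vws1.southcom.com.au (203.60.16.17)
"""

HOP = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([\d.]+)\)", re.MULTILINE)


def network_of(host):
    """Guess which network a hop belongs to from its hostname."""
    if re.fullmatch(r"[\d.]+", host):
        return host  # no hostname was reported, only an address
    labels = host.lower().split(".")
    # Crude assumption: names under a country code such as .au need three labels.
    keep = 3 if len(labels[-1]) == 2 else 2
    return ".".join(labels[-keep:])


def group_by_network(text):
    """Return [(network, [hop numbers])] for runs of consecutive hops."""
    hops = [(int(n), network_of(host)) for n, host, _ip in HOP.findall(text)]
    return [(net, [n for n, _ in run])
            for net, run in groupby(hops, key=lambda hop: hop[1])]


if __name__ == "__main__":
    for net, numbers in group_by_network(HOPS):
        print(f"{net:<20} hops {numbers}")
```

Note that this goes strictly by hostname, so hop 6 (optus-networks.Anaheim.cw.net) is counted with Cable & Wireless even though its name suggests it is where the handoff to Optus happens.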
Now you know what good ping and traceroute results look like. What do bad results look like? Typically you’ll see longer round-trip times, perhaps greater than 1000 ms (that is, greater than 1 second). You’ll also probably see lost packets, which show up in the ping command’s output as missing numbers in the ICMP sequence and are summarized in the results printed at the end. With traceroute, lost packets show up as asterisks where the round-trip time for that test packet should be. Another thing you might see in the results of a traceroute command is !H in place of a particular packet’s round-trip time; this stands for “host unreachable,” and is usually a sign of a fairly serious routing problem.
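Those warning signs are easy to spot mechanically. Here is a small Python sketch that looks for gaps in the ICMP sequence numbers of ping output and counts asterisks and !H markers on a traceroute line. The function names are hypothetical, and both samples are invented for illustration (they use the reserved documentation address 192.0.2.1 and the name router.example.net); they are not output from a real run.

```python
import re

# Invented ping replies with sequence numbers 2, 5, and 6 missing.
PING_SAMPLE = """\
64 bytes from 192.0.2.1: icmp_seq=0 ttl=243 time=1277.8 ms
64 bytes from 192.0.2.1: icmp_seq=1 ttl=243 time=1301.2 ms
64 bytes from 192.0.2.1: icmp_seq=3 ttl=243 time=1190.4 ms
64 bytes from 192.0.2.1: icmp_seq=4 ttl=243 time=1422.9 ms
64 bytes from 192.0.2.1: icmp_seq=7 ttl=243 time=1256.0 ms
"""

# Invented traceroute line where two of the three test packets were lost.
HOP_SAMPLE = "7 router.example.net (192.0.2.1) 241.389 ms * *"


def missing_sequence_numbers(ping_output):
    """Return the icmp_seq values skipped between the first and last reply."""
    seen = {int(n) for n in re.findall(r"icmp_seq=(\d+)", ping_output)}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


def hop_problems(traceroute_line):
    """Count lost packets (*) and host-unreachable markers (!H) on one line."""
    fields = traceroute_line.split()
    return {"lost": fields.count("*"), "unreachable": fields.count("!H")}


if __name__ == "__main__":
    print("missing icmp_seq:", missing_sequence_numbers(PING_SAMPLE))
    print("hop 7:", hop_problems(HOP_SAMPLE))
```

One limitation: packets lost before the first reply or after the last one leave no gap to find, so the summary that ping prints at the end is still the authoritative count.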
The traditional way to use ping and traceroute to troubleshoot a misbehaving TCP/IP connection is to first use traceroute to figure out where the packets are going, then systematically ping the hosts along the route to identify where the problem is. At some point a clever guy named Matt Kimball created a tool to carry out both of those steps simultaneously, naming the program mtr (for Matt’s traceroute; see http://www.bitwizard.nl/mtr/).
If the mtr utility is installed on your Unix server, you can run it by entering mtr followed by the name of the host you are interested in tracerouting and pinging:
[jbc@andros jbc]$ mtr www.ontas.com.au
When you do, your shell window will display a list of hosts (the same as that shown by the traceroute command) down the left side of the window, with the rest of the window taken up by constantly updating statistics on the results of repeatedly pinging each host. The longer you leave mtr running, the more data it will gather (see Figure 1-1). When you are done, type q to quit back to the shell prompt.
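To make the idea concrete, here is a rough Python sketch of the ping-every-hop-repeatedly loop described above. This is not how mtr itself is implemented; it is just the traditional two-step procedure automated under some assumptions. The function names are hypothetical, you supply the list of hops yourself (for instance, from traceroute output), and ping_once assumes a Unix-style ping that accepts -c 1 to send a single packet and prints time=... ms in its reply line.

```python
import re
import subprocess


def ping_once(host):
    """Send one ping; return the round-trip time in ms, or None if lost.

    Assumes a Unix-style ping that accepts "-c 1" and prints "time=N ms".
    """
    try:
        result = subprocess.run(["ping", "-c", "1", host],
                                capture_output=True, text=True, timeout=5)
    except (OSError, subprocess.TimeoutExpired):
        return None
    match = re.search(r"time=([\d.]+) ms", result.stdout)
    return float(match.group(1)) if match else None


def probe_route(hops, rounds, probe=ping_once):
    """Ping every hop once per round and collect the results per hop."""
    stats = {hop: {"sent": 0, "times": []} for hop in hops}
    for _ in range(rounds):
        for hop in hops:
            stats[hop]["sent"] += 1
            rtt = probe(hop)
            if rtt is not None:
                stats[hop]["times"].append(rtt)
    return stats


def report(stats):
    """Format one line per hop showing packet loss and average round trip."""
    lines = []
    for hop, s in stats.items():
        received = len(s["times"])
        loss = 100.0 * (s["sent"] - received) / s["sent"] if s["sent"] else 0.0
        avg = sum(s["times"]) / received if received else float("nan")
        lines.append(f"{hop:<40} {loss:5.1f}% loss  avg {avg:7.1f} ms")
    return lines


if __name__ == "__main__":
    # Two addresses taken from the traceroute output earlier in this section.
    route = ["209.151.233.1", "203.60.16.17"]
    for line in report(probe_route(route, rounds=5)):
        print(line)
```

The probe argument is there so you can substitute a stand-in function when trying the bookkeeping out without touching the network. The first hop where the loss percentage jumps is a reasonable place to start looking for the problem.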