BGP by Iljitsch van Beijnum

Chapter 1. The Internet, Routing, and BGP

One of the many remarkable qualities of the Internet is that it has scaled so well to its current size. This doesn’t mean that nothing has changed since the early days of the ARPANET in 1969. The opposite is true: the TCP and IP protocols in use today weren’t developed until the late 1970s. Since that time, TCP/IP has become the predominant networking protocol suite for just about every kind of digital communication.

The story goes that the Internet—or rather the ARPANET, which is regarded as the origin of today’s Internet—was invented by the military as a network that could withstand a nuclear attack. That isn’t how it actually happened. In the early 1960s, Paul Baran, a researcher for the RAND Corporation, wrote a number of memoranda proposing a digital communications network for military use that could still function after sustaining heavy damage from an enemy attack.[2] Using simulations, Baran proved that a network with only three or four times as many connections as the minimum required to operate comes close to the theoretical maximum possible robustness. This of course implies that the network adapts when connections fail, something the telephone network and the simple digital connections of that time couldn’t do, because every connection was manually configured. Baran incorporated numerous revolutionary concepts into his proposed network: packet switching, adaptive routing, the use of digital circuits to carry voice communication, and encryption inside the network. Many people believed such a network couldn’t work, and it was never built.

Several years later, the Department of Defense’s Advanced Research Projects Agency (ARPA) grew dissatisfied with the fact that many universities and other research institutions that worked on ARPA projects were unable to easily exchange results on computer-related work. Because computers from the many different vendors used different operating systems and languages, and because they were usually customized to some extent by their users, it was extremely hard to make a program developed on one computer run on another machine. ARPA wanted a network that would enable researchers to access computers located at different research institutions throughout the United States.

Access to a remote computer wasn’t a novelty in the late 1960s: connecting remote terminals over a phone line or dedicated circuit was complex but nonetheless a matter of routine. In these situations, however, the mainframe or minicomputer always controlled the communication: a user typed a command, the characters were sent to the central computer, the computer sent back the results after some time, and the terminal displayed them on the screen or on paper. Connecting two computers together was still a rather revolutionary concept, and the research institutions didn’t like the idea of connecting their computers to a network one bit. Only after it was decided that dedicated minicomputers would be used to perform all network-related tasks were people persuaded to connect their systems to the network. The use of minicomputers as Interface Message Processors (IMPs) made building the network a lot easier: rather than having to deal with a large number of very different systems on the network, each computer had to talk only to the local IMP, and the IMPs only to a single local computer and, over the network, to other IMPs. Today’s routers function in a similar way to the ARPANET IMPs.

During the 1970s, the ARPANET continued to evolve. The original Network Control Protocol (NCP) was replaced by two different protocols: the Internet Protocol (IP), which connects (internetworks) different networks, and the Transmission Control Protocol (TCP), which applications use to communicate without having to deal with the intricacies of IP. IP and TCP are often mentioned together as TCP/IP to encompass the entire family of related protocols used on the Internet.

Topology of the Internet

Because it’s a “network of networks,” there was always a need to interconnect the different networks that together form the global Internet. In the beginning, everyone simply connected to the ARPANET, but over the years, the topology of the Internet has changed radically.

The NSFNET Backbone

During the late 1980s, the ARPANET was replaced as the major “backbone” of the Internet by a new National Science Foundation-sponsored network between five supercomputer locations: the NSFNET Backbone. Federal Internet Exchanges on the East and West Coasts (FIX East and FIX West) were built in 1989 to aid in the transition from the ARPANET to the NSFNET Backbone. Originally, the FIXes were 10-Mbps Ethernets, but 100-Mbps FDDI was added later to increase bandwidth. The Commercial Internet Exchange (CIX, “kicks”) on the West Coast came into existence because the people in charge of the FIXes were hesitant to connect commercial networks. CIX operated a CIX router and several FDDI rings for some time, but it abandoned those activities and turned into a trade association in the late 1990s. In 1992, Metropolitan Fiber Systems (MFS, now Worldcom) built a Metropolitan Area Ethernet (MAE) in the Washington, DC, area, which quickly became a place where many different (commercial) networks interconnected. Interconnecting at an Internet Exchange (IX) or MAE is attractive, because many networks connect to the IX or MAE infrastructure, so all that’s needed is a single physical connection to interconnect with many other networks.
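The attraction of a shared exchange can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every network wants to interconnect directly with every other network; it compares the number of private point-to-point links this would take with the number of connections needed at a single exchange. The function names are hypothetical.

```python
def private_links(networks: int) -> int:
    """Links needed when every pair of networks interconnects directly."""
    return networks * (networks - 1) // 2


def exchange_connections(networks: int) -> int:
    """Connections needed when each network attaches once to a shared exchange."""
    return networks


if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        print(f"{n:>3} networks: {private_links(n):>5} private links, "
              f"{exchange_connections(n):>3} exchange connections")
```

The number of private links grows with the square of the number of networks, while the number of exchange connections grows only linearly, which is why a single physical connection to an IX or MAE is so attractive.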

Commercial Backbones and NAPs

Before the early 1990s, the Internet was almost exclusively used as a research network. Some businesses were connected, but this was limited to their research divisions. All this changed when email became more pervasive outside the research community, and the World Wide Web made the network much more visible. More and more business and nonresearch organizations connected to the network, and the additional traffic became a burden for the NSFNET Backbone. Also, the NSFNET Backbone Acceptable Use Policy didn’t allow “for-profit activities.” In 1995, the NSFNET Backbone was decommissioned, giving room to large ISPs to compete with each other by operating their own backbone networks. To ensure connectivity between the different networks, four contracts for Network Access Points (NAPs) were awarded by the NSF, each run by a different telecommunication company:

  • The Pacific Bell NAP in San Francisco, California

  • The Ameritech NAP in Chicago, Illinois

  • The Sprint NAP in Pennsauken, New Jersey (in the Philadelphia metropolitan area, but often referred to as “the New York NAP”)

  • The already existing MAE East,[3] run by MCI Worldcom, in Vienna, Virginia

The NAPs were created as large-scale exchange points where commercial networks could interconnect without being limited by the NSFNET Acceptable Use Policy. The NAPs were also used to interconnect with a new national research network for high-bandwidth applications, the “very high performance Backbone Network Service” (vBNS).

The Ameritech (Chicago) NAP was built on ATM technology from the start; the Sprint (New Jersey) and PacBell (San Francisco) NAPs used FDDI at first and migrated to ATM later. MAE East also adopted FDDI in addition to Ethernet at this point, and the (Worldcom-trademarked) acronym was quickly changed to mean “Metropolitan Area Exchange.” Since its last FDDI location was decommissioned in 2001, MAE East has been ATM-only as well. Note that Ethernet and FDDI can be interconnected at the datalink level (bridged), so if an IX uses both, a connection to either suffices. However, it isn’t easy to bridge between Ethernet or FDDI on one side and ATM on the other.

Over the past several years, the NAPs have become less important as the main interconnect locations for Internet traffic. Large networks are showing a tendency to interconnect privately, and smaller networks are looking more and more at regional public interconnect locations. There are now numerous small Internet Exchanges in the United States, and in addition to Worldcom, two other companies now operate Internet Exchanges as a commercial service: Equinix and PAIX. Figure 1-1 shows the distribution of NAPs, MAEs, Equinix Internet Business Exchanges, and PAIX exchanges.

Figure 1-1. Distribution of interconnect locations in the United States

The Rest of the World

The traffic volumes for the Internet Exchanges in Europe and the Asia/Pacific region were much lower at the time the NAPs were being created, so these exchanges were not forced to adopt expensive (FDDI) or then still immature (ATM) technologies as the American NAPs were. Because Ethernet is cheap, easier to configure than ATM, and conveniently available in several speeds, most of the non-NAP and non-MAE Internet Exchanges use Ethernet. There are also a few that use frame relay, SMDS, or SRP, usually when the Internet Exchange isn’t limited to a single location or a small number of locations but allows connections to any ISP office or point of presence (POP) within a metropolitan area.

In Europe, most countries have an Internet Exchange. From an international perspective, the main ones are the London Internet Exchange (LINX), the Amsterdam Internet Exchange (AMS-IX), and the Deutsche Commercial Internet Exchange (DE-CIX) in Frankfurt. Internet Exchanges in the rest of the world haven’t yet reached the scale of those in the United States and Europe and are used mainly to exchange national traffic.

Transit and Peering

When a customer connects to an Internet service provider (ISP), the customer pays. This seems natural. Because the customer pays, the ISP has to carry packets to and from all possible destinations worldwide for this customer. This is called transit service. Smaller ISPs buy transit from larger ISPs, just as end-user organizations do. But ISPs of roughly similar size also interconnect in a different way: they exchange traffic as equals. This is called peering, and typically, there is no money exchanged. Unlike transit, peering traffic always has one network (or one of its customers) as the source and the other network (or one of its customers) as its destination. Chapter 12 offers more details on interconnecting with other networks and peering.
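The difference between transit and peering can be expressed as a simple forwarding rule. The following is a minimal sketch, not an actual router policy: it assumes each neighboring network is labeled as a customer, a peer, or a provider (the network we buy transit from), and that traffic to or from hosts inside our own network is treated like customer traffic. The names `Neighbor` and `will_forward` are hypothetical.

```python
from enum import Enum


class Neighbor(Enum):
    CUSTOMER = "customer"  # pays us for transit
    PEER = "peer"          # exchanges traffic with us as an equal
    PROVIDER = "provider"  # we pay this network for transit


def will_forward(came_from: Neighbor, going_to: Neighbor) -> bool:
    """Return True if we carry traffic between these two kinds of neighbors.

    Transit: a customer pays, so we carry its packets to and from anywhere.
    Peering: we only carry peer traffic that starts or ends with a customer.
    Either way, at least one side of the exchange must be a customer.
    """
    return Neighbor.CUSTOMER in (came_from, going_to)


if __name__ == "__main__":
    for src in Neighbor:
        for dst in Neighbor:
            verdict = "carry" if will_forward(src, dst) else "refuse"
            print(f"{src.value:>8} -> {dst.value:<8} {verdict}")
```

Under this rule a network never carries traffic between two peers, or between a peer and a provider, because nobody would be paying for it.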

Classification of ISPs

Not all ISPs are created equal: they range from huge, with worldwide networks, to tiny, with only a single Ethernet as their “backbone.” Generally, ISPs are categorized into three groups:

  • Tier-1 ISPs are so large they don’t pay anyone else for transit. They don’t have to, because they peer with all other tier-1 networks. All other networks pay at least one tier-1 ISP for transit, so peering with all tier-1 ISPs ensures connectivity to the entire Internet.

  • Tier-2 ISPs have a sizable network of their own, but they aren’t large enough to convince all tier-1 networks to peer with them, so they get transit service from at least one tier-1 ISP.

  • Tier-3 ISPs don’t have a network to speak of, so they purchase transit service from one or more tier-1 or tier-2 ISPs that operate in the area. If they peer with other networks, it’s usually at just a single exchange point. Many don’t even multihome.

The line between tier-1 networks and the largest tier-2 is somewhat blurred, with some tier-2 networks doing “paid peering” with tier-1 networks and calling themselves tier-1. The real difference is that tier-2 networks generally have a geographically limited presence. For instance, even some very large European networks with trans-Atlantic connections of their own pay a U.S. network for transit, rather than interconnecting with a large number of other networks at NAPs throughout the United States. Because tier-1 networks see these regional ISPs as potential customers, they are less likely to peer with them. This goes double for tier-3 networks. Tier-2 networks, on the other hand, may not peer with many tier-1 networks, but they often peer with all other tier-2 networks operating in the same region and with many tier-3 networks.
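The three-way classification can be summarized as a rough decision rule. The sketch below is a deliberate simplification: whether a network counts as “sizable” is a judgment call that is reduced here to a flag, paid peering is ignored, and the `ISP` record, its fields, and the example networks are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ISP:
    name: str
    transit_providers: tuple = ()      # networks this ISP pays for transit
    has_sizable_network: bool = False  # judgment call, simplified to a flag


def tier(isp: ISP) -> int:
    """Classify an ISP as tier 1, 2, or 3 following the description above."""
    if not isp.transit_providers:
        return 1  # pays nobody for transit
    return 2 if isp.has_sizable_network else 3


def is_multihomed(isp: ISP) -> bool:
    """True if the ISP buys transit from more than one provider."""
    return len(isp.transit_providers) > 1


if __name__ == "__main__":
    examples = [
        ISP("GlobalNet", has_sizable_network=True),
        ISP("RegionalNet", ("GlobalNet",), has_sizable_network=True),
        ISP("TownNet", ("RegionalNet",)),
    ]
    for isp in examples:
        print(f"{isp.name}: tier {tier(isp)}, multihomed: {is_multihomed(isp)}")
```

As the text notes, the real boundaries are blurrier than any such rule suggests, particularly between tier-1 networks and the largest tier-2 networks.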

[2] The “On Distributed Communications” series is available online at http://www.rand.org/publications/RM/baran.list.html.

[3] There is now also a MAE West, interconnected with FIX West.
