Chapter 4. The Ethernet Frame and Full-Duplex Mode
The tutorial in Chapter 3 introduced the Ethernet system and provided a brief look at how it works. In this chapter, we take a more detailed look at the Ethernet frame and the full-duplex mode of operation. You don’t need to know all the details of the frame and Ethernet system operation in order to build and use Ethernets. However, an understanding of these elements can certainly help when designing networks or troubleshooting problems.
The original half-duplex mode Media Access Control (MAC) protocol was designed to allow a set of stations to compete for access to a shared Ethernet channel, based on coaxial cable segments linked with signal repeaters. The half-duplex media access control protocol is based on carrier sense with multiple access and collision detection, which gives rise to the CSMA/CD acronym.
The development of full-duplex media systems made it possible for Ethernet links to operate in full-duplex mode, providing a higher-performance mode of operation than the one supported over shared channels using CSMA/CD. The Auto-Negotiation protocol described in Chapter 5 automatically selects the highest-performance mode of operation over a link, typically resulting in full-duplex mode for Ethernet connections. Today, the vast majority of Ethernet links operate in full-duplex mode, which we will describe in this chapter.
However, half-duplex mode is still supported for Ethernet interfaces operating at 10 or 100 Mb/s over twisted-pair cables, and you may find a station connected to a switch port over a link that is in half-duplex mode. The operation of the original half-duplex mode is described in detail in Appendix B.
A twisted-pair link segment is capable of supporting full-duplex operation by virtue of having two pairs of wires that support data being sent in both directions. A station operating in half-duplex mode while it is connected to a twisted-pair media system may indicate a misconfigured link, or an issue with the Auto-Negotiation system. See Chapter 5 for details.
To simplify the description of these elements, this chapter is in two parts. The first two sections look at the structure of the frame and the full-duplex media access control system. The following two sections examine flow control and describe how the high-level network software on a computer uses Ethernet frames to send data.
The Ethernet Frame
The organization of the Ethernet frame is central to the operation of the system. The Ethernet standard determines both the structure of a frame and when a station is allowed to send a frame. The frame was first defined in the original Ethernet DEC-Intel-Xerox (DIX) standard, and was later redefined and modified in the IEEE 802.3 standard. The changes between the two standards were mostly cosmetic, except for the type or length field.
The DIX standard defined a type field in the frame. The first 802.3 standard (published in 1985) specified this field as a length field, with a mechanism that allowed both versions of frames to coexist on the same Ethernet system. Most networking software kept using the type field version of the frame. A later version of the IEEE 802.3 standard was changed to define this field of the frame as being either length or type, depending on usage.
Figure 4-1 shows the DIX and IEEE versions of the Ethernet frame. There are three sizes of frame currently defined in the standard, and a given Ethernet interface must support at least one of them. The standard recommends that new implementations support the most recent frame definition, called an envelope frame, which has a maximum size of 2,000 bytes. The two other sizes are basic frames, with a maximum size of 1,518 bytes, and Q-tagged frames with a maximum of 1,522 bytes.
Because the DIX and IEEE basic frames both have a maximum size of 1,518 bytes and are identical in terms of the number and length of fields, Ethernet interfaces can send either DIX or IEEE basic frames. The only difference in these frames is in the contents of the fields and the subsequent interpretation of those contents by the network interface software.
Next, we’ll take a detailed tour of the frame fields.
The frame begins with the 64-bit preamble field, which was originally incorporated to allow 10 Mb/s Ethernet interfaces to synchronize with the incoming data stream before the fields relevant to carrying the content arrived.
The preamble was initially provided to allow for the loss of a few bits due to signal start-up delays as the signal propagates through a cabling system. Like the heat shield of a spacecraft, which protects the spacecraft from burning up during reentry, the preamble was originally developed as a shield to protect the bits in the rest of the frame when operating at 10 Mb/s.
The original 10 Mb/s cabling systems could include long stretches of coaxial cables, joined by signal repeaters. The preamble ensures that the entire path has enough time to start up, so that signals are received reliably for the rest of the frame.
The higher-speed Ethernet systems use more complex mechanisms for encoding the signals that avoid any signal start-up losses, and these systems don’t need a preamble to protect the frame signals. However, it is maintained for backward compatibility with the original Ethernet frame and to provide some extra timing for interframe housekeeping, as demonstrated, for example, in the 40 Gb/s system.
While there are differences in how the two standards formally defined the preamble bits, there is no practical difference between the DIX and IEEE preambles. The pattern of bits being sent is identical:
- DIX standard
- In the DIX standard, the preamble consists of eight “octets,” or 8-bit bytes. The first seven comprise a sequence of alternating ones and zeros. The eighth byte of the preamble contains 6 bits of alternating ones and zeros, but ends with the special pattern of “1, 1.” These two bits signal to the receiving interface that the end of the preamble has been reached, and that the bits that follow are the actual fields of the frame.
- IEEE standard
- In the 802.3 specification, the preamble field is formally divided into two parts consisting of seven bytes of preamble and one byte called the start frame delimiter (SFD). The last two bits of the SFD are 1, 1, as with the DIX standard.
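The bit pattern described by both standards can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes the alternating pattern begins with a one, which is how the 802.3 standard defines it.

```python
def preamble_bits() -> list:
    """Return the 64 preamble bits in the order they are sent on the channel."""
    bits = [1, 0] * 28          # seven octets (56 bits) of alternating ones and zeros
    bits += [1, 0, 1, 0, 1, 0]  # eighth octet: six more alternating bits
    bits += [1, 1]              # final "1, 1" pattern that marks the end of the preamble
    return bits


bits = preamble_bits()
print(len(bits))                     # 64
print("".join(map(str, bits[-8:])))  # 10101011
```

Whether you call the last octet part of the preamble (DIX) or the start frame delimiter (IEEE), the 64 bits produced are the same.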
The destination address field follows the preamble. Each Ethernet interface is assigned a unique 48-bit address, called the interface’s physical or hardware address. The destination address field contains either the 48-bit Ethernet address that corresponds to the address of the interface in the station that is the destination of the frame, a 48-bit multicast address, or the broadcast address.
Ethernet interfaces read in every frame up through at least the destination address field. If the destination address does not match the interface’s own Ethernet address, or one of the multicast or broadcast addresses that the interface is programmed to receive, then the interface is free to ignore the rest of the frame. Here is how the two standards implement destination addresses:
- DIX standard
- The first bit of the destination address, as sent onto the network medium, is used to distinguish physical addresses from multicast addresses. If the first bit is zero, then the address is the physical address of an interface, which is also known as a unicast address, because a frame sent to this address only goes to one destination. If the first bit of the address is a one, then the frame is being sent to a multicast address. If all 48 bits are ones, this indicates the broadcast, or all-stations, address.
- IEEE standard
- The IEEE 802.3 version of the frame adds significance to the second bit of the destination address, which is used to distinguish between locally and globally administered addresses. A globally administered address is a physical address assigned to the interface by the manufacturer, which is indicated by setting the second bit to zero. (DIX Ethernet addresses are always globally administered.) If the address of the Ethernet interface is administered locally for some reason, then the second bit is supposed to be set to a value of one. In the case of a broadcast address, the second bit and all other bits are ones in both the DIX and IEEE standards.
Locally administered addresses are rarely used on Ethernet systems, because each Ethernet interface is assigned its own unique 48-bit address at the factory. Locally administered addresses, however, were used on some other local area network systems.
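As a rough sketch of how software might apply these rules, the following Python functions classify a destination address and mimic the receive filter of an interface. All of the function names are hypothetical, and the multicast address in the usage lines is only an example value. The code relies on the fact that the first bit sent on the channel is the least significant bit of the first octet, which is explained in the discussion of physical addresses.

```python
BROADCAST = bytes([0xFF] * 6)


def parse_mac(text: str) -> bytes:
    """Convert 'F0-2E-15-6C-77-9B' (or colon-separated) into six raw octets."""
    octets = bytes(int(part, 16) for part in text.replace(":", "-").split("-"))
    if len(octets) != 6:
        raise ValueError("an Ethernet address has exactly six octets")
    return octets


def classify(address: bytes) -> str:
    """Name the address type based on the first bit sent on the channel."""
    if address == BROADCAST:
        return "broadcast"
    # The first bit transmitted is the least significant bit of the first octet.
    return "multicast" if address[0] & 0x01 else "unicast"


def is_locally_administered(address: bytes) -> bool:
    """Test the second bit transmitted, which is bit 1 of the first octet."""
    return bool(address[0] & 0x02)


def should_accept(dest: bytes, own: bytes, multicast_groups: set) -> bool:
    """Accept frames sent to our own address, broadcast, or a programmed group."""
    return dest == own or dest == BROADCAST or dest in multicast_groups


own = parse_mac("F0-2E-15-6C-77-9B")
groups = {parse_mac("01-00-5E-00-00-01")}   # example multicast group
print(classify(own))                         # unicast
print(should_accept(BROADCAST, own, groups)) # True
```

A real interface performs this filtering in hardware; the sketch only shows the decision being made.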
Understanding physical addresses
In Ethernet, the 48-bit physical address is written as 12 hexadecimal digits with the digits paired in groups of two, representing an octet (8 bits) of information. The octet order of transmission on the Ethernet is from the leftmost octet (as written or displayed) to the rightmost octet. The actual transmission order of bits within the octet, however, goes from the least significant bit of the octet through to the most significant bit.
This means that an Ethernet address that is written as the hexadecimal string F0-2E-15-6C-77-9B is equivalent to the following sequence of bits, sent over the Ethernet channel from left to right: 0000 1111 0111 0100 1010 1000 0011 0110 1110 1110 1101 1001.
Therefore, the 48-bit destination address that begins with the hexadecimal value 0xF0 is a unicast address, because the first bit sent on the channel is a zero.
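The conversion from the written form of an address to its transmission order can be checked with a short Python sketch. The function name is hypothetical; the logic simply reverses the bits of each octet, because the least significant bit is sent first.

```python
def wire_bits(address: str) -> str:
    """Return the bits of a written address in transmission order.

    Octets are sent left to right as written. Within each octet the least
    significant bit is sent first, so each group of 8 bits is reversed.
    """
    groups = []
    for part in address.split("-"):
        octet_bits = format(int(part, 16), "08b")[::-1]  # least significant bit first
        groups.append(octet_bits[:4] + " " + octet_bits[4:])
    return " ".join(groups)


print(wire_bits("F0-2E-15-6C-77-9B"))
# 0000 1111 0111 0100 1010 1000 0011 0110 1110 1110 1101 1001
```

The first character of the output is the bit that distinguishes unicast from multicast addresses.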
The next field in the frame is the source address. This is the physical address of the device that sent the frame. The source address is not interpreted in any way by the Ethernet MAC protocol, although it must always be the unicast address of the device sending the frame. It is provided for the use of high-level network protocols, and as an aid in troubleshooting. It is also used by switches to build a table associating source addresses with switch ports. An Ethernet station uses its physical address as the source address in any frame it transmits.
The DIX standard notes that a station can change the Ethernet source address, while the IEEE standard does not specifically state that an interface may have the ability to override the 48-bit physical address assigned by the manufacturer. However, all Ethernet interfaces in use these days appear to allow the physical address to be changed, which makes it possible for the network administrator or the high-level network software to modify the Ethernet interface address if necessary.
To provide the physical address used in the source address field, a vendor of Ethernet equipment acquires an organizationally unique identifier (OUI), which is a unique 24-bit identifier assigned by the IEEE. The OUI forms the first half of the physical address of any Ethernet interface that the vendor manufactures. As each interface is manufactured, the vendor also assigns a unique address to the interface using the second 24 bits of the 48-bit address space, and that, combined with the OUI, creates the 48-bit address. The OUI may make it possible to identify the vendor of the interface chip, which can sometimes be helpful when troubleshooting network problems.
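Splitting an address into its two halves is a simple matter, as this Python sketch shows. The function name is hypothetical, and the sketch makes no attempt to look up which vendor owns a given OUI.

```python
def split_address(address: str):
    """Split a written 48-bit address into its OUI and vendor-assigned halves."""
    octets = address.upper().split("-")
    if len(octets) != 6:
        raise ValueError("expected six hyphen-separated octets")
    oui = "-".join(octets[:3])      # first 24 bits, assigned by the IEEE
    device = "-".join(octets[3:])   # second 24 bits, assigned by the vendor
    return oui, device


oui, device = split_address("F0-2E-15-6C-77-9B")
print(oui, device)   # F0-2E-15 6C-77-9B
```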
The Q-tag is so called because it carries an 802.1Q tag, also known as a VLAN or priority tag. The 802.1Q standard defines a virtual LAN (VLAN) as one or more switch ports that function as a separate and independent Ethernet system on a switch. Ethernet traffic within a given VLAN (e.g., VLAN 100) will be sent and received only on those ports of the switch that are defined to be members of that particular VLAN (in this case, VLAN 100). A 4-byte Q-tag is inserted in an Ethernet frame between the source address and the length/type field to identify the VLAN to which the frame belongs. When a Q-tag is present, the minimum data field size is reduced to 42 bytes, maintaining a minimum frame size of 64 bytes.
Switches can be connected together with an Ethernet segment that functions as a trunk connection that carries Ethernet frames with VLAN tags in them. That, in turn, makes it possible for Ethernet frames belonging to VLAN 100, for example, to be carried between multiple switches and sent or received on switch ports that are assigned to VLAN 100.
VLAN tagging, a vendor innovation, was originally accomplished using a variety of proprietary approaches. Development of the IEEE 802.1Q standard for virtual bridged LANs produced the VLAN tag as a vendor-neutral mechanism for identifying which VLAN a frame belongs to.
The addition of the 4-byte VLAN tag causes the maximum size of an Ethernet frame to be extended from the original maximum of 1,518 bytes (not including the preamble) to a new maximum of 1,522 bytes. Because VLAN tags are only added to Ethernet frames by switches and other devices that have been programmed to send and receive VLAN-tagged frames, this does not affect traditional, or “classic,” Ethernet operation.
The first two bytes of the Q-tag contain an Ethernet type identifier of 0x8100. If an Ethernet station that is not programmed to send or receive a VLAN tagged frame happens to receive a tagged frame, it will see what looks like a type identifier for an unknown protocol type and simply discard the frame. VLANs and the contents and organization of VLAN tags are described in Chapter 19.
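The following Python sketch shows one way software could recognize a Q-tag in a received frame. The function name is hypothetical, and the frame is assumed to start at the destination address, with the preamble already removed. The layout of the second two bytes of the tag (a 3-bit priority value and a 12-bit VLAN ID) comes from the 802.1Q standard rather than from this chapter.

```python
import struct

TPID_8021Q = 0x8100  # type identifier carried in the first two bytes of the Q-tag


def parse_q_tag(frame: bytes):
    """Return (vlan_id, priority, inner_type) for a tagged frame, else None."""
    if len(frame) < 18:
        return None
    tpid, tci, inner_type = struct.unpack("!HHH", frame[12:18])
    if tpid != TPID_8021Q:
        return None             # untagged: bytes 12-13 are the length/type field
    vlan_id = tci & 0x0FFF      # low 12 bits (layout per 802.1Q)
    priority = tci >> 13        # top 3 bits carry the priority value
    return vlan_id, priority, inner_type


dest = b"\xff" * 6
src = bytes.fromhex("f02e156c779b")
tagged = dest + src + struct.pack("!HHH", 0x8100, (5 << 13) | 100, 0x0800) + bytes(42)
print(parse_q_tag(tagged))   # (100, 5, 2048)
```

A station that does not understand tags would instead treat 0x8100 as an unknown protocol type and discard the frame.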
Envelope Prefix and Suffix
As networks grew in complexity and features, the IEEE received requests for more tags to achieve new goals. The VLAN tag provided space for a VLAN ID and Class of Service (CoS) bits, but vendors and standards groups wanted to add extra tags to support new bridging features and other schemes.
To accommodate these requests, the 802.3 standards engineers defined an “envelope frame,” which adds an extra 482 bytes to the maximum frame size. The envelope frame was specified in the 802.3as supplement to the standard, adopted in 2006. In another change, the tag data was added to the data field to produce a MAC Client Data field. Because the MAC client data field includes the tagging fields, it may seem like the frame size definition has changed, but in fact this is just a way of referring to the combination of tag data and the data field for the purpose of defining the envelope frame.
The 802.3as supplement modified the standard to state that an Ethernet implementation should support at least one of three maximum MAC client data field sizes. The data field size continues to be defined as 46 to 1,500 bytes, but to that is added the tagging information to create the MAC client data field, resulting in the following MAC client data field sizes:
- 1,500-byte “basic frames” (no tagging information)
- 1,504-byte “Q-tagged frames” (1,500-byte data field plus 4-byte tag)
- 1,982-byte “envelope frames” (1,500-byte data field plus 482 bytes for all tags)
The standard notes that:
The envelope frame is intended to allow inclusion of additional prefixes and suffixes required by higher layer encapsulation protocols … such as those defined by the IEEE 802.1 working group (such as Provider Bridges and MAC Security), ITU-T or IETF (such as MPLS). The original MAC Client Data field maximum remains 1500 octets while the encapsulation protocols may add up to an additional 482 octets.
The contents of the tag space are not defined in the Ethernet standard, allowing maximum flexibility for the other standards to provide tags in Ethernet frames. Either or both prefix and suffix tags can be used in a given frame, occupying a maximum tag space of 482 bytes if either or both are present. This can result in a maximum frame size of 2,000 bytes.
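The arithmetic behind these sizes can be laid out in a short Python sketch. The names are hypothetical; the 18 bytes of overhead are the two addresses, the length/type field, and the frame check sequence.

```python
HEADER_AND_FCS_BYTES = 6 + 6 + 2 + 4   # destination, source, length/type, FCS
MAX_DATA_BYTES = 1500                  # maximum data field size
MAX_TAG_BYTES = {"basic": 0, "Q-tagged": 4, "envelope": 482}


def max_client_data(kind: str) -> int:
    """Maximum MAC client data field: the data field plus any tag space."""
    return MAX_DATA_BYTES + MAX_TAG_BYTES[kind]


def max_frame_size(kind: str) -> int:
    """Maximum frame size, not including the preamble."""
    return max_client_data(kind) + HEADER_AND_FCS_BYTES


for kind in MAX_TAG_BYTES:
    print(f"{kind}: client data {max_client_data(kind)}, frame {max_frame_size(kind)}")
# basic: client data 1500, frame 1518
# Q-tagged: client data 1504, frame 1522
# envelope: client data 1982, frame 2000
```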
The latest standard simply includes the Q-tag as one of the tags that can be carried in an envelope prefix. The standard notes, “All Q-tagged frames are envelope frames, but not all envelope frames are Q-tagged frames.” In other words, you can use the envelope space for any kind of tagging, and if you use a Q-tag, then it is carried in the envelope prefix as defined in the latest standard. An envelope frame carrying a Q-tag will have a minimum data size of 42 bytes, preserving the minimum frame size of 64 bytes.
Tagged frames are typically sent between switch ports that have been configured to add and remove tags as necessary to achieve their goals. Those goals can include VLAN operations and tagging a frame as a member of a given VLAN, or more complex tagging schemes to provide information for use by higher-level switching and routing protocols. Normal stations typically send basic Ethernet frames without tags, and will drop tagged frames that they are not configured to accept.
Type or Length Field
- DIX standard
- In the DIX Ethernet standard, this 16-bit field is called a type field, and it always contains an identifier that refers to the type of high-level protocol data being carried in the data field of the Ethernet frame. For example, the hexadecimal value 0x0800 has been assigned as the identifier for the Internet Protocol (IP). A DIX frame being used to carry an IP packet is sent with the value of 0x0800 in the type field of the frame. All IP packets are carried in frames with this value in the type field.
- IEEE standard
- When the IEEE 802.3 standard was first published in 1985, the type field was not included, and instead the IEEE specifications called this field a length field. Type fields were added to the IEEE 802.3 standard in 1997, so the use of a type field in the frame is officially recognized in 802.3. This change simply made the common practice of using the type field an official part of the standard. The identifiers used in the type field were originally assigned and maintained by Xerox, but with the type field now part of the IEEE standard, the responsibility for assigning type numbers was transferred to the IEEE.
In the IEEE 802.3 standard, this field is called a length/type field, and the hexadecimal value in the field indicates the manner in which the field is being used. The first octet of the field is considered the most significant octet in terms of numeric value.
If the value in this field is numerically less than or equal to 1,500 (decimal), then the field is being used as a length field. In that case, the value in the field indicates the number of logical link control (LLC) data octets that follow in the data field of the frame. If the number of LLC octets is less than the minimum required for the data field of the frame, then octets of padding data will automatically be added to make the data field large enough. The content of the padding data is unspecified by the standard. Upon reception of the frame, the length field is used to determine the length of valid data in the data field, and the padding data is discarded.
If the value in this field of the frame is numerically greater than or equal to 1,536 decimal (0x600 hex), then the field is being used as a type field. In that case, the hexadecimal identifier in the field is used to indicate the type of protocol data being carried in the data field of the frame. The network software on the station is responsible for providing any padding data required to ensure that the data field is at least 46 bytes in length. With this method, there is no conflict or ambiguity about whether the field indicates length or type.
The range of 1,501 to 1,535 was intentionally left undefined in the standard.
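The rules for interpreting the length/type field can be summarized in a small Python sketch. The function names are hypothetical. The two octets are read with the first octet as the most significant, as the standard specifies.

```python
def interpret_length_type(value: int) -> str:
    """Report how a 16-bit length/type value is being used."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("the field is 16 bits wide")
    if value <= 1500:
        return "length"      # number of LLC data octets that follow
    if value >= 1536:        # 0x0600
        return "type"        # identifier of the protocol data being carried
    return "undefined"       # 1,501 through 1,535


def interpret_field(field: bytes) -> str:
    """Interpret the two raw octets; the first octet is most significant."""
    return interpret_length_type(int.from_bytes(field, "big"))


print(interpret_field(b"\x08\x00"))   # type
print(interpret_length_type(46))      # length
```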
- DIX standard
- In a DIX frame, the data field must contain a minimum of 46 bytes of data, and may range up to a maximum of 1,500 bytes of data. The network protocol software is expected to provide at least 46 bytes of data.
- IEEE standard
- The total size of the data field in an IEEE 802.3 frame is the same as in a DIX frame: a minimum of 46 bytes and a maximum of 1,500. However, a logical link control protocol defined in the IEEE 802.2 LLC standard may ride in the data field of the 802.3 frame to provide control information. The LLC protocol is also used as a way to identify the type of protocol data being carried by the frame if the length/type field is used for length information. The LLC protocol data unit (PDU) is carried in the first set of bytes in the data field of the IEEE frame. The structure of the LLC PDU is defined in the IEEE 802.2 LLC standard.
The process of figuring out which protocol software stack gets the data in an incoming frame is known as demultiplexing. An Ethernet frame may use the type field to identify the high-level protocol data being carried by the frame. In the LLC specification, the receiving station demultiplexes the frame by deciphering the contents of the logical link control protocol data unit. These issues are described in more detail later in this chapter.
The last field in both the DIX and IEEE frames is the frame check sequence (FCS) field, also called the cyclic redundancy check (CRC) field. This 32-bit field contains a value that is used to check the integrity of the bits in the frame fields (not including the preamble/SFD). The value is computed with a CRC, a polynomial calculation performed over the contents of the destination, source, type (or length), and data fields. As the frame is generated by the transmitting station, the CRC value is simultaneously being calculated. The 32 bits of the CRC value that result from this calculation are placed in the FCS field as the frame is sent. The x^31 coefficient of the CRC polynomial is sent as the first bit of the field, and the x^0 coefficient as the last.
The CRC is calculated again by the interface in the receiving station as the frame is read in. The result of this second calculation is compared with the value sent in the FCS field by the originating station. If the two values are identical, then the receiving station is provided with a high level of assurance that no errors have occurred during transmission over the Ethernet channel. If the values are not identical, then the interface can discard the frame and increment the frame error counter.
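To make the send and receive checks concrete, here is a minimal Python sketch that assembles a frame, pads the data field, appends an FCS, and verifies it on reception. Several things here are assumptions for illustration: the function names are hypothetical, the padding is filled with zeros (the content of padding is not specified), the CRC is computed with `zlib.crc32` from the Python standard library, which uses the same CRC-32 polynomial as Ethernet, and the FCS is stored least significant byte first, which is the convention commonly seen when frames are handled in software. Real interfaces perform this calculation in hardware.

```python
import struct
import zlib

MIN_DATA = 46   # minimum data field size for an untagged frame


def build_frame(dest: bytes, src: bytes, ether_type: int, data: bytes) -> bytes:
    """Assemble destination, source, type, padded data, and FCS (no preamble)."""
    if len(data) > 1500:
        raise ValueError("the data field is limited to 1,500 bytes")
    padded = data.ljust(MIN_DATA, b"\x00")      # zero padding is an arbitrary choice
    body = dest + src + struct.pack("!H", ether_type) + padded
    fcs = struct.pack("<I", zlib.crc32(body))   # assumed software byte order
    return body + fcs


def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC over everything but the last 4 bytes and compare."""
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == fcs


frame = build_frame(b"\xff" * 6, bytes.fromhex("f02e156c779b"), 0x0800, b"hello")
print(len(frame))      # 64
print(fcs_ok(frame))   # True
```

Changing any bit of the frame after it is built causes `fcs_ok` to return False, which corresponds to the case where the interface discards the frame and increments its frame error counter.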
End of Frame Detection
The presence of a signal on the Ethernet channel is known as carrier. The transmitting interface stops sending data after the last bit of a frame is transmitted, which causes the Ethernet channel to become idle. In the original 10 Mb/s system, the loss of carrier when the channel goes idle tells the receiving interface that the frame transmission has come to an end. The higher-speed Ethernet systems use more complex signal encoding schemes, which have special symbols available for signaling to the interface the start and end of a frame.
A basic frame carrying a maximum data field of 1,500 bytes is actually 1,518 bytes in length (not including the preamble) when the 18 bytes needed for the addresses, length/type field, and the frame check sequence are included. The addition of a further 482 bytes for envelope frames makes the maximum frame size become 2,000 bytes. This was chosen as a useful maximum frame size that could be handled by a typical Ethernet implementation in an interface or switch port, while providing enough room for current and future prefixes and suffixes.
Full-Duplex Media Access Control
The full-duplex mode of operation was added to the standard in 1997, to allow simultaneous communication between a pair of stations over a link. The link between the stations must be composed of a point-to-point media segment, such as twisted-pair or fiber optic media, that provides independent transmit and receive data paths. In full-duplex mode, both stations can simultaneously transmit and receive, which doubles the aggregate capacity of the link. For example, a half-duplex Fast Ethernet twisted-pair segment provides a maximum of 100 Mb/s of bandwidth. When operated in full-duplex mode, the same 100BASE-TX twisted-pair segment can provide a total aggregate bandwidth of 200 Mb/s.
Another major advantage of full-duplex operation is that the maximum segment length is no longer limited by the timing requirements of the original shared-channel half-duplex Ethernet system. In full-duplex mode, the only limits are those set by the signal-carrying capabilities of the media segment. This is especially useful for fiber optic segments, allowing those segments to span long distances.
The full-duplex mode was specified in the 802.3x supplement to the standard. This supplement was approved for adoption into the IEEE 802.3 standard in March 1997. The 802.3x supplement also describes an optional set of mechanisms used for flow control over full-duplex links. The mechanisms used to establish flow control are called MAC control and PAUSE. First we’ll describe how full-duplex mode works, and then we’ll show how the MAC control and PAUSE mechanisms can be used to provide flow control over a full-duplex link.
Full-duplex operation requires the following:
- The media system must have independent transmit and receive data paths that can operate simultaneously.
- Exactly two stations can be connected by any full-duplex point-to-point link. There is no contention for use of a shared medium, so the multiple access algorithm (i.e., CSMA/CD) is unnecessary and is not used.
- Both stations on the network link must be capable of, and have been configured to use, the full-duplex mode of operation. This means that both Ethernet interfaces must have the capability to simultaneously transmit and receive frames.
Figure 4-2 shows two stations simultaneously sending and receiving over a full-duplex link segment. The segment provides independent data paths so that both stations can be active without interfering with one another’s transmissions.
When sending a frame in full-duplex mode, the station does not defer to traffic being received on the channel. However, the station still waits for an interframe gap period between frame transmissions, as Ethernet interfaces are designed to expect a gap between successive frames. Providing the interframe gap ensures that the interfaces at each end of the link can keep up with the full frame rate of the link.
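To see what the "full frame rate of the link" means in practice, the following Python sketch estimates the maximum number of frames per second in one direction. The function name is hypothetical. The interframe gap value of 96 bit times (equivalent to 12 bytes) comes from the 802.3 standard and is not stated in this chapter.

```python
PREAMBLE_BYTES = 8          # 64-bit preamble
INTERFRAME_GAP_BYTES = 12   # 96 bit times, per the 802.3 standard


def max_frames_per_second(link_bps: int, frame_bytes: int) -> float:
    """Frames per second in one direction when frames are sent back to back."""
    bits_per_frame_slot = (PREAMBLE_BYTES + frame_bytes + INTERFRAME_GAP_BYTES) * 8
    return link_bps / bits_per_frame_slot


print(round(max_frames_per_second(100_000_000, 64)))     # 148810
print(round(max_frames_per_second(100_000_000, 1518)))   # 8127
```

On a full-duplex link, each direction can sustain this rate independently, because neither station defers to received traffic.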
A station on a full-duplex link transmits whenever it wishes to, without respect to carrier sense (CS), which indicates frames being received from the other station on the receive side of the link segment. There is no multiple access (MA), as there is only one station at each end of the link and the Ethernet channel between them is not the subject of access contention by multiple stations. Because there is no access contention, there will be no collisions either, so the stations at each end of the link also ignore collision detection (CD), which indicates frame reception while transmitting.
Effects of Full-Duplex Operation
While full-duplex operation has the potential to double the bandwidth of an Ethernet link segment, it usually won’t result in a large increase in performance on a link that connects to a user’s computer. That’s because few applications send and receive the same amount of data simultaneously. Instead, many applications send some data (e.g., the data resulting from a web click) and then wait for a response. This leads to asymmetric data patterns, in which data that is making requests is sent in one direction, and then larger amounts of data return with the response, often including text, images, or video streams.
On the other hand, full-duplex links between switches in a network backbone system will typically carry multiple conversations between many computers. Therefore, the aggregated traffic on backbone channels will be more symmetric, with both transmit and receive channels seeing roughly the same amount of traffic. For that reason, the largest benefits of a full-duplex bandwidth increase are usually seen in backbone links.
Configuring Full-Duplex Operation
To ensure correct configuration of the Ethernet interfaces at each end of a link, the standard recommends that Ethernet Auto-Negotiation (see Chapter 5) be used whenever possible to automatically configure full-duplex mode. The vast majority of twisted-pair Ethernet interfaces and switch ports support Auto-Negotiation, which will automatically select the highest-performance mode of operation supported by the two stations on a link segment.
It is essential that both ends of a link operating in full-duplex mode are configured correctly, or the link will have data errors. However, using Auto-Negotiation to configure full-duplex operation on a link may not be as simple as it sounds. For one thing, support for Auto-Negotiation is optional for some Ethernet media systems, in which case the vendor is not required to provide Auto-Negotiation capability.
Auto-Negotiation was originally developed for twisted-pair Ethernet devices only, and after the original development of 10BASE-T; thus, it is not supported on all Ethernet media types or older 10BASE-T systems. The older 10 Mb/s and 100 Mb/s fiber optic media systems also do not support the Auto-Negotiation standard, while Gigabit Ethernet fiber optic systems have their own auto-configuration scheme. Therefore, you may find that you have to manually configure full-duplex support on the stations at each end of the link.
On a manually configured link, if only one end of the link is in full-duplex mode and the other is in half-duplex mode, then the half-duplex end of the link will lose frames due to errors, such as late collisions. Data will still flow across the link, but as the full-duplex end will be sending data whenever it pleases, it will not be obeying the same CSMA/CD rules as the half-duplex end. Because the misconfigured link will still support the flow of data (despite the errors), it is possible that this problem may not be detected right away. Therefore, you need to be aware that this condition can occur, and make sure that both ends of a manually configured link are set for the same mode of operation.
Full-Duplex Media Support
Table 4-1 provides a list of Ethernet media systems, and indicates which ones can support the full-duplex mode of operation.
|Media system|Cable type|Full-duplex support?|
| |50 ohm thick coaxial cable| |
| |50 ohm thin coaxial cable| |
| |75 ohm coaxial cable| |
| |2 multimode optical fibers| |
| |2 multimode or single-mode optical fibers| |
| |2-pair shielded twisted-pair| |
| |Short-range twinaxial cables| |
| |Short-range twinaxial cables| |
Full-Duplex Media Segment Distances
When a segment is operating in full-duplex mode, CSMA/CD-based MAC operation is disabled. As a result, the cable length limits imposed by the round-trip timing constraints of the CSMA/CD algorithm no longer exist. In the absence of a round-trip timing limit imposed by the CSMA/CD MAC algorithm, the only constraint on cable length is the one imposed by the signal transmission characteristics of the cable. For that reason, some full-duplex segments can be much longer than the same segments operating in half-duplex mode.
For twisted-pair cabling, it is the signal-carrying characteristics of the wires that limit segment length. The 10/100/1000BASE-T and 10GBASE-T media systems have a maximum cabling distance recommendation of 100 meters (328 feet) for twisted-pair cable. This limit is the same whether the segment is operated in full-duplex or half-duplex mode.
Fiber optic segments, with their excellent signal-carrying characteristics, are mostly limited in length by the timing constraints of half-duplex operation. For that reason, a full-duplex mode fiber optic segment can be considerably longer than the same segment type operating in half-duplex mode. As an example, a 100BASE-FX fiber optic segment using a typical multimode fiber optic cable is limited to segment lengths of 412 meters (1351.6 feet) in half-duplex mode. However, the same media system can reach as far as 2 kilometers (6561.6 feet) when operated in full-duplex mode.
Single-mode fiber optic media can carry signals over longer distances than multimode fiber. Therefore, a full-duplex fiber link can work over considerably longer distances if single-mode fiber is used. In the case of a 100BASE-FX link, single-mode fiber can provide link distances of 20 kilometers (12.42 miles) or more. For full-duplex links, you need to consult the equipment vendor for specifications on the maximum length of the segment.
Ethernet Flow Control
Ethernet flow control is a mechanism that allows an interface or switch port to send a signal requesting a short pause in frame transmission. At the time that this feature was developed, vendors were implementing various approaches to controlling Ethernet frame transmission, in an attempt to manage limited switch and interface resources on busy networks. To provide a vendor-neutral way to signal a request for a brief pause in frame transmission, an explicit flow control message is provided by the optional MAC control and PAUSE specifications in the 802.3x full-duplex supplement.
Today, switch and interface resources are no longer as limited as they once were, and while Ethernet flow control is implemented by vendors, it is not widely used for its original purpose. Instead, you will find PAUSE-based flow control used in data center switch implementations, for example, to provide quality of service for file storage data flows.
The optional MAC control portion of the 802.3x supplement provides a mechanism for real-time control and manipulation of the frame transmission and reception process in an Ethernet station. In normal Ethernet operation, the Media Access Control (MAC) protocol defines how to go about transmitting and receiving frames. In the Ethernet flow control system, the MAC control protocol provides mechanisms to control when Ethernet frames are sent.
The MAC control system provides a way for the station to receive a MAC control frame and act upon it. The operation of the MAC control system is transparent to the normal media access control functions in a station. MAC control is not used for non-real-time functions, such as configuring interfaces, that are handled by network management mechanisms. Instead, MAC control is designed to allow stations to interact in real time to control the flow of traffic. The specification allows for new functions beyond flow control to be added in the future.
MAC control frames are identified with a type value of 0x8808 (hex). A station equipped with optional MAC control receives all frames using the normal Ethernet MAC functions, and then passes the frames to the MAC control software for interpretation. If the frame contains the hex value 0x8808 in the type field, then the MAC control function reads the frame, looking for MAC control operation codes carried in the data field. If the frame does not contain the 0x8808 value in the type field, then MAC control takes no action, and the frame is passed along to the normal frame reception software on the station.
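The type-field check described above can be sketched in a few lines of Python. This is only an illustration, not how any particular interface implements it: the function name is hypothetical, and the sketch assumes the frame bytes start at the destination address (no preamble) and carry no VLAN tag, so that the type field sits at bytes 12 and 13.

```python
import struct

MAC_CONTROL_TYPE = 0x8808  # type value that identifies MAC control frames


def is_mac_control_frame(frame: bytes) -> bool:
    """Return True if the frame's type field holds 0x8808.

    Assumption: `frame` begins with the 6-byte destination address,
    followed by the 6-byte source address, and has no VLAN tag, so the
    two-byte type field is found at offset 12.
    """
    if len(frame) < 14:
        return False  # too short to contain a type field
    (type_field,) = struct.unpack_from("!H", frame, 12)
    return type_field == MAC_CONTROL_TYPE
```

A frame that fails this check would simply be handed to the normal frame reception software.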
MAC control frames contain operation codes (opcodes) in the data field of the frame. The frame size is fixed at the minimum frame size allowed in the standard, with 46 bytes in the data field. The opcode is contained in the first two bytes of the data field. There is no reliable transport mechanism, so MAC control must be able to deal with the fact that MAC control frames may be lost, discarded, damaged, or delayed.
The PAUSE system of flow control on full-duplex link segments, originally defined in 802.3x, uses MAC control frames to carry the PAUSE commands. The MAC control opcode for a PAUSE command is 0x0001 (hex). A station that receives a MAC control frame with this opcode in the first two bytes of the data field knows that the control frame is being used to implement the PAUSE operation, for the purpose of providing flow control on a full-duplex link segment. Only stations configured for full-duplex operation may send PAUSE frames.
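As a rough sketch of the opcode lookup, the following Python reads the first two bytes of the data field and compares them with the PAUSE opcode. The function names are hypothetical, and the code assumes it is handed only the data field of a frame that has already been identified as a MAC control frame.

```python
import struct

PAUSE_OPCODE = 0x0001  # MAC control opcode for the PAUSE command


def mac_control_opcode(data_field: bytes) -> int:
    """Return the opcode carried in the first two bytes of the data field."""
    if len(data_field) < 2:
        raise ValueError("data field too short to hold an opcode")
    (opcode,) = struct.unpack_from("!H", data_field, 0)
    return opcode


def is_pause(data_field: bytes) -> bool:
    """Return True if the MAC control data field carries a PAUSE command."""
    return mac_control_opcode(data_field) == PAUSE_OPCODE
```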
“PAUSE” is not an acronym. Instead, PAUSE is written in uppercase letters to indicate that the word is a formally defined function in the MAC control standard. This is common practice for formally defined words and phrases in the standard.
When a station equipped with MAC control wishes to send a PAUSE command, it sends a PAUSE frame to the 48-bit destination multicast address of 01-80-C2-00-00-01. This particular multicast address has been reserved for use in PAUSE frames. Having a well-known multicast address simplifies the flow control process by making it unnecessary for a station at one end of the link to discover and store the address of the station at the other end of the link.
Another advantage of using this multicast address arises from the use of flow control on full-duplex segments between switches. The particular multicast address used was selected from a range of addresses reserved by the IEEE 802.1D standard, which specifies basic Ethernet switch (bridge) operation. Normally, a frame with a multicast destination address that is sent to a switch will be forwarded out all other ports of the switch. However, this range of multicast addresses is special—they will not be forwarded by an 802.1D-compliant switch. Instead, frames sent to these addresses are understood by the switch to be frames meant to be acted upon within the switch.
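To make the filtering rule concrete, here is a small Python sketch that tests whether a destination address falls in the reserved range. The range used, 01-80-C2-00-00-00 through 01-80-C2-00-00-0F, is the block reserved by 802.1D and is not spelled out in the text above; the helper names are hypothetical.

```python
RESERVED_PREFIX = bytes.fromhex("0180c20000")  # first five bytes of the range


def parse_mac(text: str) -> bytes:
    """Convert an address such as '01-80-C2-00-00-01' into six raw bytes."""
    return bytes.fromhex(text.replace("-", "").replace(":", ""))


def is_reserved_bridge_address(address: bytes) -> bool:
    """Return True for 01-80-C2-00-00-00 through 01-80-C2-00-00-0F.

    A switch that follows 802.1D does not forward frames sent to
    these destination addresses.
    """
    return (
        len(address) == 6
        and address[:5] == RESERVED_PREFIX
        and address[5] <= 0x0F
    )
```

With this check, the PAUSE address 01-80-C2-00-00-01 is recognized as reserved, while an ordinary multicast address is not.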
A station sending a PAUSE frame to the special multicast address includes not only the PAUSE opcode, but also the period of pause time being requested, in the form of a two-byte integer. This number contains the length of time for which the receiving station is requested to stop transmitting data. The pause time is measured in units of pause “quanta,” where each unit is equal to 512 bit times. The range of possible pause time requests is from 0 through 65,535 units.
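Because the pause time is expressed in bit times, the actual duration of a pause depends on the speed of the link. The following Python sketch converts a pause request into seconds; the function name and the choice of bits per second as the speed unit are assumptions made for illustration.

```python
PAUSE_QUANTUM_BIT_TIMES = 512  # one pause quantum, in bit times
MAX_PAUSE_QUANTA = 65_535      # largest value of the two-byte pause time


def pause_seconds(quanta: int, bits_per_second: float) -> float:
    """Return how long a pause request lasts on a link of the given speed."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause time must fit in a two-byte integer")
    bit_times = quanta * PAUSE_QUANTUM_BIT_TIMES
    return bit_times / bits_per_second
```

For example, the maximum request of 65,535 quanta is 33,553,920 bit times, which works out to roughly 33.6 milliseconds on a 1 Gb/s link and roughly 3.36 milliseconds on a 10 Gb/s link.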
Figure 4-3 shows what a PAUSE frame looks like. The PAUSE frame is carried in the data field of the MAC control frame. The MAC control opcode of 0x0001 indicates that this is a PAUSE frame. The PAUSE frame carries a single parameter, defined as the pause_time in the standard. In this example, the content of pause_time is 2, indicating a request that the device at the other end of the link stop transmitting for a period of two pause quanta (1,024 bit times total).
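The layout just described can be approximated in Python as follows. This is a sketch of the frame contents only: it omits the preamble and the frame check sequence, which the interface hardware normally supplies, and the source address used in the example is a made-up, locally administered address.

```python
import struct

PAUSE_DESTINATION = bytes.fromhex("0180c2000001")  # reserved multicast address
MAC_CONTROL_TYPE = 0x8808
PAUSE_OPCODE = 0x0001
DATA_FIELD_LENGTH = 46  # MAC control frames use the minimum data field size


def build_pause_frame(source: bytes, pause_time: int) -> bytes:
    """Build a PAUSE frame, without preamble or frame check sequence."""
    if len(source) != 6:
        raise ValueError("source address must be 6 bytes")
    if not 0 <= pause_time <= 0xFFFF:
        raise ValueError("pause_time must fit in two bytes")
    header = PAUSE_DESTINATION + source + struct.pack("!H", MAC_CONTROL_TYPE)
    data = struct.pack("!HH", PAUSE_OPCODE, pause_time)
    data += bytes(DATA_FIELD_LENGTH - len(data))  # zero padding to 46 bytes
    return header + data


# Hypothetical source address, used only for this example.
frame = build_pause_frame(bytes.fromhex("020000000001"), pause_time=2)
```

The resulting frame is 60 bytes long: 14 bytes of addresses and type field, followed by the 46-byte data field that begins with the opcode and the pause_time.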
By using MAC control frames to send PAUSE requests, a station at one end of a full-duplex link can request the station at the other end of the link to stop transmitting frames for a period of time. This provides real-time flow control between switches, or between a switch and a server that are equipped with the optional MAC control software and connected by a full-duplex link.
High-Level Protocols and the Ethernet Frame
The process of identifying which high-level network protocol data is being carried in the data field of an Ethernet frame is called multiplexing. In multiplexing, multiple sources of information can be carried over a single system. In this case, multiple high-level protocols can be sent over the same Ethernet system in separate Ethernet frames.
Multiplexing Data in Frames
The original system of multiplexing for Ethernet is based on using the type field in the Ethernet frame. For example, the high-level protocol software on a computer can create a packet of IP data, and then hand the packet to software that understands how to create Ethernet frames with type fields. The software inserts a hexadecimal value into the type field of the frame; this value corresponds to the type of high-level protocol being carried by the frame. It then hands the data to the interface driver software for transmission over the Ethernet.
The Ethernet interface driver software deals with the details of interacting with the Ethernet interface to send the frame over the Ethernet channel. When carrying IP packets, the type field will be assigned the hexadecimal value 0x0800. The receiving station then uses the value in the type field to identify the protocol data being carried, and thus demultiplex the received frame.
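Demultiplexing on the type field amounts to a table lookup. The Python sketch below shows the idea with hypothetical handler functions; the value 0x0800 for IP comes from the text, while 0x0806 for ARP is a well-known type value added here only to give the table a second entry. The frame is assumed to start at the destination address and to carry no VLAN tag.

```python
import struct


def handle_ip(payload: bytes) -> str:
    return f"IP packet, {len(payload)} bytes"


def handle_arp(payload: bytes) -> str:
    return f"ARP message, {len(payload)} bytes"


# Type value -> handler for the high-level protocol it identifies.
HANDLERS = {
    0x0800: handle_ip,
    0x0806: handle_arp,  # not mentioned in the text; included for illustration
}


def demultiplex(frame: bytes):
    """Hand the data field to the handler selected by the type field.

    Returns None when no handler is registered for the type value.
    """
    (type_field,) = struct.unpack_from("!H", frame, 12)
    handler = HANDLERS.get(type_field)
    if handler is None:
        return None
    return handler(frame[14:])
```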
Each layer of the network system is substantially independent from the other layers. Encapsulating the data being passed between layers helps maintain independence between the layers, making it possible for a complex system of network software to be broken down into more manageable chunks. By providing standardized operating system interfaces to the network programmers, the complexity of each network layer is effectively hidden from view.
The programmer is free to write software that hands the completed high-level protocol packet to the appropriate computer system software interface. The details of placing the protocol packet into the data field of an Ethernet frame are automatically dealt with. In this way, an IP-based application, and the IP software itself, can function without major changes regardless of which physical network system the computer happens to be attached to.
Things are made somewhat more complex because of the presence of two methods of identifying data in a frame: one using a type field to identify data, and one using the IEEE 802.2 logical link control (LLC) standard. However, many network drivers are capable of identifying and dealing with multiple frame formats.
IEEE Logical Link Control
As we’ve seen, the value of the identifier in the length/type field determines which way the field is being used. When used as a length field, the task of identifying the type of high-level protocol being carried in the frame is moved to the 802.2 LLC fields carried in the first few bytes of the data field. Let’s look at the LLC fields in a little more detail.
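The rule for interpreting the length/type field can be written as a short function. The thresholds used here are the ones defined for the field in the 802.3 standard: values up to 1500 decimal are lengths, and values of 1536 decimal (0x0600) and above are type identifiers. The function name and the returned strings are illustrative only.

```python
MAX_LENGTH_VALUE = 1500  # largest value interpreted as a length
MIN_TYPE_VALUE = 1536    # 0x0600, smallest value interpreted as a type


def classify_length_type(value: int) -> str:
    """Report how a length/type field value is to be interpreted."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("the field is two bytes long")
    if value <= MAX_LENGTH_VALUE:
        return "length"  # protocol identification moves to the LLC fields
    if value >= MIN_TYPE_VALUE:
        return "type"    # the value identifies the high-level protocol
    return "undefined"   # 1501 through 1535 are not defined
```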
Figure 4-4 shows an IEEE 802.2 LLC protocol data unit, or PDU. The LLC PDU contains a destination service access point (DSAP), which identifies the high-level protocol that the data in the frame is intended for, much like the type field does. After a source service access point (SSAP) and some control data, the actual user data (the data that makes up the high-level protocol packet) follows the LLC fields.
When network protocol software uses the 802.2 LLC fields, multiplexing and demultiplexing work in the same way that they do for a frame with a type field. The difference is that the identification of the type of high-level protocol data is shifted to the DSAP, which is located in the LLC PDU. The whole LLC PDU fits inside the first few bytes of the data field of the Ethernet frame. In frames carrying LLC fields, the actual amount of high-level protocol data that can be carried is a few bytes less than in frames that use a type field.
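As an informal illustration of where the LLC fields sit, the following Python sketch pulls the DSAP, SSAP, and control values out of the start of a data field. It assumes the 802.2 convention that an unnumbered-format control field (low two bits both set) occupies one byte, while the other formats occupy two bytes; the class and function names are hypothetical.

```python
from typing import NamedTuple


class LLCHeader(NamedTuple):
    dsap: int           # destination service access point
    ssap: int           # source service access point
    control: int        # control field value (one or two bytes)
    header_length: int  # bytes consumed by the LLC fields


def parse_llc(data_field: bytes) -> LLCHeader:
    """Read the LLC fields from the first bytes of the data field."""
    if len(data_field) < 3:
        raise ValueError("data field too short for LLC fields")
    dsap, ssap, first = data_field[0], data_field[1], data_field[2]
    if first & 0x03 == 0x03:
        # Unnumbered format: one-byte control field.
        return LLCHeader(dsap, ssap, first, 3)
    if len(data_field) < 4:
        raise ValueError("data field too short for two-byte control field")
    # Information and supervisory formats: two-byte control field.
    return LLCHeader(dsap, ssap, (first << 8) | data_field[3], 4)
```

The header_length value shows why such frames carry slightly less user data: with a three-byte LLC header, 1,497 of the 1,500 data field bytes remain for the high-level protocol packet.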
You may be wondering why the IEEE went to all the trouble of defining the 802.2 LLC protocol to provide multiplexing when the type field seems to be able to do the job just as well. The reason is that the IEEE 802 committee was created to standardize a set of LAN technologies, and not just the 802.3 Ethernet system. To do that, they needed something that would work no matter which LAN technology was in use.
Because there was no guarantee that all LAN frames would have a type field, the IEEE 802 committee provided the LLC protocol as a method of identifying the type of data being carried by the frame. All LAN systems have a data field, so it is easy enough to write network protocol software that can look at the first few bytes of data in the data field, and then interpret that data in terms of the LLC specifications.
The LLC Sub-Network Access Protocol
Just to make things more interesting, the 802.2 LLC protocol can also be used to carry the original Ethernet type identifiers. In other words, when you send a frame on a non-Ethernet LAN technology that does not provide a type field in its frame, there’s a way to use the LLC fields to provide a type identifier. The rationale for this approach comes from the fact that the LLC fields are not large. Given that limitation, the IEEE didn’t want to use up the limited number of bits in the LLC fields to provide identifiers for the older high-level protocol types. Instead, a method was created to preserve the existing set of high-level protocol type identifiers, and to reuse them in the IEEE LLC system.
This approach, known as LLC Sub-Network Access Protocol (SNAP) encapsulation, provides yet another set of bytes in the data field of the frame. The contents of the LLC fields of the frame are used to identify another set of bits in the data field, organized according to the SNAP specification, and the SNAP fields are used to carry the older protocol type identifiers. The standard for using SNAP encapsulation to carry IP is documented in RFC 1042. (RFCs can be found at http://tools.ietf.org; for more information, see Appendix A.)
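A SNAP-encapsulated data field can be recognized and unpacked along these lines. The sketch assumes the layout used by RFC 1042: DSAP and SSAP values of 0xAA, a control value of 0x03, a three-byte organization code, and then the two-byte type identifier. The function name is hypothetical.

```python
import struct

SNAP_LLC_FIELDS = b"\xaa\xaa\x03"  # DSAP, SSAP, and control values for SNAP


def parse_snap(data_field: bytes):
    """Return (organization_code, type_identifier, payload), or None.

    None is returned when the data field does not begin with the LLC
    values that announce a SNAP header.
    """
    if len(data_field) < 8 or data_field[:3] != SNAP_LLC_FIELDS:
        return None
    organization_code = data_field[3:6]
    (type_identifier,) = struct.unpack_from("!H", data_field, 6)
    return organization_code, type_identifier, data_field[8:]
```

For an IP packet carried this way, the type identifier recovered from the SNAP fields is 0x0800, the same value that would appear in the type field of an Ethernet frame.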
If you’re writing network protocol software, then SNAP encapsulation is a handy way to continue using the same high-level protocol type identifiers when sending frames over other LAN systems. In the Ethernet system itself, of course, TCP/IP protocol software simply uses the type field, and you don’t need to concern yourself with any of this. However, you will probably encounter SNAP encapsulation if you deal with multiple LAN systems at this level of detail.
As a network user, you don’t need to lose sleep over which frame format your computers may be using. The choice of frame format is built into your networking software, and there’s nothing you need to do about it.
 IEEE Std 802.3-2012, paragraph 3.2.7, Note 1, p. 56.