Understanding Linux Network Internals

by Christian Benvenuti

December 2005

Intermediate to advanced

1066 pages

33h 38m

English

O'Reilly Media, Inc.

The Audience for This Book
What Is Not Covered

1.1. Basic Terminology
1.2.1. Memory Caches
1.2.2. Caching and Hash Tables
1.2.3. Reference Counts
1.2.4. Garbage Collection
1.2.5. Function Pointers and Virtual Function Tables (VFTs)
1.2.6. goto Statements
1.2.7. Vector Definitions
1.2.8. Conditional Directives (#ifdef and family)
1.2.9. Compile-Time Optimization for Condition Checks
1.2.10. Mutual Exclusion
1.2.11. Conversions Between Host and Network Order
1.2.12. Catching Bugs
1.2.13. Statistics
1.2.14. Measuring Time
1.4.1. Dead Code
2.1. The Socket Buffer: sk_buff Structure
2.1.1. Networking Options and Kernel Structures
2.1.2. Layout Fields
2.1.3. General Fields
2.1.4. Feature-Specific Fields
2.1.5. Management Functions
2.1.5.1. Allocating memory: alloc_skb and dev_alloc_skb
2.1.5.2. Freeing memory: kfree_skb and dev_kfree_skb
2.1.5.3. Data reservation and alignment: skb_reserve, skb_put, skb_push, and skb_pull
2.1.5.4. The skb_shared_info structure and the skb_shinfo function
2.1.5.5. Cloning and copying buffers
2.1.5.6. List management functions
2.2.1. Identifiers
2.2.2. Configuration
2.2.2.1. Interface types and ports
2.2.2.2. Promiscuous mode
2.2.3. Statistics
2.2.4. Device Status
2.2.5. List Management
2.2.6. Link Layer Multicast
2.2.7. Traffic Management
2.2.8. Feature Specific
2.2.9. Generic
2.2.10. Function Pointers
3.1. Overview
3.2.1. procfs
3.2.2. sysctl: Directory /proc/sys
3.2.2.1. Examples of ctl_table initialization
3.2.2.2. Registering a file in /proc/sys
3.2.2.3. Core networking files and directories
4.1. Reasons for Notification Chains
4.6.1. Wrappers
4.6.2. Examples
5.1. System Initialization Overview
5.4.1. Hardware Interrupts
5.4.1.1. Interrupt types
5.4.1.2. Interrupt sharing
5.4.1.3. Organization of IRQs to handler mappings
5.7.1. Legacy Code
5.8.1. kmod
5.8.2. Hotplug
5.8.2.1. /sbin/hotplug
5.9.1. Examples of Virtual Devices
5.9.2. Interaction with the Kernel Network Stack
6.1. Data Structures Featured in This Chapter
7.1. Boot-Time Kernel Options
7.1.1. Registering a Keyword
7.1.2. Two-Pass Parsing
7.1.3. .init.setup Memory Section
7.1.4. Use of Boot Options to Configure Network Devices
7.2.1. Old Model: Conditional Code
7.2.2. New Model: Macro-Based Tagging
7.3.1. Initialization Macros for Device Initialization Routines
7.4.1. xxx_initcall Macros
7.4.1.1. Example of __initcall and __exitcall routines: modules
7.4.1.2. Example of dependency between initialization routines
7.4.1.3. Legacy code
7.5.1. __init and __exit Macros
7.5.2. xxx_initcall and __exitcall Sections
7.5.3. Other Optimizations
7.5.4. Dynamic Macros' Definition
8.1. When a Device Is Registered
8.5.1. Device Driver Initializations
8.5.2. Device Type Initialization: xxx_setup Functions
8.5.3. Optional Initializations and Special Cases
8.6.1. Lookups
8.7.1. Queuing Discipline State
8.7.2. Registration State
8.8.1. Split Operations: netdev_run_todo
8.8.2. Device Registration Status Notification
8.8.2.1. netdev_chain notification chain
8.8.2.2. RTnetlink link notifications
8.9.1. register_netdevice Function
8.10.1. unregister_netdevice Function
8.10.2. Reference Counts
8.10.2.1. Function netdev_wait_allrefs
8.12.1. Interactions with Power Management
8.12.1.1. Suspending a device
8.12.1.2. Resuming a device
8.12.2. Link State Change Detection
8.12.2.1. Scheduling and processing link state change events
8.12.2.2. Linkwatch flags
8.13.1. Ethtool
8.13.1.1. Drivers that do not support ethtool
8.13.2. Media Independent Interface (MII)
9.1. Decisions and Traffic Direction
9.2.1. Polling
9.2.2. Interrupts
9.2.3. Processing Multiple Frames During an Interrupt
9.2.4. Timer-Driven Interrupts
9.2.5. Combinations
9.2.6. Example
9.3.1. Reasons for Bottom Half Handlers
9.3.2. Bottom Halves Solutions
9.3.3. Concurrency and Locking
9.3.4. Preemption
9.3.5. Bottom-Half Handlers
9.3.5.1. Bottom-half handlers in kernel 2.2
9.3.5.2. Bottom-half handlers in kernel 2.4 and above: the introduction of the softirq
9.3.6. Tasklets
9.3.7. Softirq Initialization
9.3.8. Pending softirq Handling
9.3.8.1. __do_softirq function
9.3.9. Per-Architecture Processing of softirq
9.3.10. ksoftirqd Kernel Threads
9.3.10.1. Starting the threads
9.3.11. Tasklet Processing
9.3.12. How the Networking Code Uses softirqs
9.4.1. Fields of softnet_data
9.4.2. Initialization of softnet_data
10.1. Interactions with Other Features
10.4.1. Introduction to the New API (NAPI)
10.4.2. net_device Fields Used by NAPI
10.4.3. net_rx_action and NAPI
10.4.4. Old Versus New Driver Interfaces
10.4.5. Manipulating poll_list
10.5.1. Initial Tasks of netif_rx
10.5.2. Managing Queues and Scheduling the Bottom Half
10.6.1. Congestion Management in netif_rx
10.6.2. Average Queue Length and Congestion-Level Computation
10.7.1. Backlog Processing: The process_backlog Poll Virtual Function
10.7.2. Ingress Frame Processing
10.7.2.1. Handling special features
11.1. Enabling and Disabling Transmissions
11.1.1. Scheduling a Device for Transmission
11.1.2. Queuing Discipline Interface
11.1.2.1. qdisc_restart function
11.1.3. dev_queue_xmit Function
11.1.3.1. Queueful devices
11.1.3.2. Queueless devices
11.1.4. Processing the NET_TX_SOFTIRQ: net_tx_action
11.1.4.1. Watchdog timer
12.1. Statistics
13.1. Overview of Network Stack
13.1.1. The Big Picture
13.1.2. Link Layer Choices for Ethernet (LLC and SNAP)
13.1.3. How the Network Stack Operates
13.2.1. Special Media Encapsulation
13.5.1. Setting the Packet Type
13.5.2. Setting the Ethernet Protocol and Length
13.5.3. Logical Link Control (LLC)
13.5.3.1. The IPX case
13.5.3.2. Linux's LLC implementation
13.5.3.3. Processing ingress LLC frames
13.5.4. Subnetwork Access Protocol (SNAP)
14.1. Repeaters, Bridges, and Routers
14.6.1. Broadcast and Multicast Addresses
14.6.2. Aging
14.7.1. Bridging Loops
14.7.2. Loop-Free Topologies
14.7.3. Defining a Loop-Free Topology
15.1. Basic Terminology
15.3.1. Root Bridge
15.3.2. Designated Bridges
15.3.3. Spanning Tree Ports
15.3.3.1. Port states
15.3.3.2. Port roles
15.5.1. Configuration BPDU
15.5.2. Priority Vector
15.5.3. When to Transmit Configuration BPDUs
15.5.4. BPDU Aging
15.6.1. Root Bridge Selection
15.6.2. Root Port Selection
15.6.3. Designated Port Selection
15.6.4. Examples of STP in Action
15.7.1. Avoiding Temporary Loops
15.8.1. Short Aging Timer
15.8.2. Letting All Bridges Know About a Topology Change
15.8.3. Example of a Topology Change
15.11.1. Ingress BPDUs
15.11.2. Ingress Configuration BPDUs
15.13.1. Rapid Spanning Tree Protocol (RSTP)
15.13.2. Multiple Spanning Tree Protocol (MSTP)
16.1. Bridge Device Abstraction
16.8.1. Deleting a Bridge Port
16.13.1. Lookups
16.13.2. Reference Counts
16.13.3. Adding, Updating, and Removing Entries
16.13.4. Aging
16.14.1. Data Frames Versus BPDUs
16.14.2. Processing Data Frames
16.16.1. Key Spanning Tree Routines
16.16.2. Bridge IDs and Port IDs
16.16.3. Enabling the Spanning Tree Protocol on a Bridge Device
16.16.4. Processing Ingress BPDUs
16.16.5. Transmitting BPDUs
16.16.6. Configuration Updates
16.16.7. Root Bridge Selection
16.16.7.1. Becoming the root bridge
16.16.7.2. Giving up the root bridge role
16.16.8. Timers
16.16.9. Handling Topology Changes
17.1. User-Space Configuration Tools
17.1.1. Handling Configuration Changes
17.1.2. Old Interface Versus New Interface
17.1.3. Creating Bridge Devices and Bridge Ports
17.1.4. Configuring Bridge Devices and Ports
17.5.1. bridge_id Structure
17.5.2. net_bridge_fdb_entry Structure
17.5.3. net_bridge_port Structure
17.5.4. net_bridge Structure
18.1. IP Protocol: The Big Picture
18.3.1. "End of Option List" and "No Operation" Options
18.3.2. Source Route Option
18.3.3. Record Route Option
18.3.4. Timestamp Option
18.3.5. Router Alert Option
18.4.1. Effect of Fragmentation on Higher Layers
18.4.2. IP Header Fields Used by Fragmentation/Defragmentation
18.4.3. Examples of Problems with Fragmentation/Defragmentation
18.4.3.1. Retransmissions
18.4.3.2. Associating fragments with their IP packets
18.4.3.3. Example of IP ID generation
18.4.3.4. Example of unsolvable defragmentation problem: NAT
18.4.4. Path MTU Discovery
18.5.1. APIs for Checksum Computation
18.5.2. Changes to the L4 Checksum
19.1. Main IPv4 Data Structures
19.1.1. Checksum-Related Fields from sk_buff and net_device Structures
19.1.1.1. net_device structure
19.1.1.2. sk_buff structure
19.2.1. Protocol Initialization
19.2.2. Interaction with Netfilter
19.2.3. Interaction with the Routing Subsystem
19.2.4. Processing Input IP Packets
19.2.5. The ip_rcv_finish Function
19.3.1. Option Processing
19.3.2. Option Parsing
19.3.2.1. Option: strict and loose Source Routing
19.3.2.2. Option: Record Route
19.3.2.3. Option: Timestamp
19.3.2.4. Option: Router Alert
19.3.2.5. Handling parsing errors
20.1. Forwarding
20.1.1. ICMP Redirect
20.1.2. ip_forward Function
20.1.3. ip_forward_finish Function
20.1.4. dst_output Function
21.1. Key Functions That Perform Transmission
21.1.1. Multicast Traffic
21.1.2. Relevant Socket Data Structures for Local Traffic
21.1.3. The ip_queue_xmit Function
21.1.3.1. Setting the route
21.1.3.2. Building the IP header
21.1.4. The ip_append_data Function
21.1.4.1. Basic memory allocation and buffer organization for ip_append_data
21.1.4.2. Memory allocation and buffer organization for ip_append_data with Scatter Gather I/O
21.1.4.3. Key routines for handling fragmented buffers
21.1.4.4. Further handling of the buffers
21.1.4.5. Setting the context
21.1.4.6. Getting ready for fragment generation
21.1.4.7. Copying data into the fragments: getfrag
21.1.4.8. Buffer allocation
21.1.4.9. Main loop
21.1.4.10. L4 checksum
21.1.5. The ip_append_page Function
21.1.6. The ip_push_pending_frames Function
21.1.7. Putting Together the Transmission Functions
21.1.8. Raw Sockets
22.1. IP Fragmentation
22.1.1. Functions Involved with IP Fragmentation
22.1.2. The ip_fragment Function
22.1.3. Slow Fragmentation
22.1.4. Fast Fragmentation
22.2.1. Organization of the IP Fragments Hash Table
22.2.2. Key Issues in Defragmentation
22.2.3. Functions Involved with Defragmentation
22.2.4. New ipq Instance Initialization
22.2.5. The ip_defrag Function
22.2.6. The ip_frag_queue Function
22.2.6.1. Handling overlaps
22.2.6.2. L4 checksum
22.2.7. Garbage Collection
22.2.8. Hash Table Reorganization
23.1. Long-Living IP Peer Information
23.1.1. Initialization
23.1.2. Lookups
23.1.3. How the IP Layer Uses inet_peer Structures
23.1.4. Garbage Collection
23.4.1. Main Functions That Manipulate IP Addresses and Configuration
23.4.2. Change Notification: rtmsg_ifa
23.4.3. inetaddr_chain Notification Chain
23.4.4. IP Configuration via ip
23.4.5. IP Configuration via ifconfig
23.8.1. iphdr Structure
23.8.2. ip_options Structure
23.8.3. ipcm_cookie Structure
23.8.4. ipq Structure
23.8.5. inet_peer Structure
23.8.6. ipstats_mib Structure
23.8.7. in_device Structure
23.8.8. in_ifaddr Structure
23.8.9. ipv4_devconf Structure
23.8.10. ipv4_config Structure
23.8.11. cork Structure
23.8.12. skb_frag_t Structure
24.1. Available L4 Protocols
24.2.1. Registration: inet_add_protocol and inet_del_protocol
24.3.1. Raw Sockets and Raw IP
24.3.2. Delivering Raw Input Datagrams to the Recipient Application
24.3.3. IPsec
25.1. ICMP Header
25.3.1. ICMP_ECHO and ICMP_ECHOREPLY
25.3.2. ICMP_DEST_UNREACH
25.3.3. ICMP_SOURCE_QUENCH
25.3.4. ICMP_REDIRECT
25.3.5. ICMP_TIME_EXCEEDED
25.3.6. ICMP_PARAMETERPROB
25.3.7. ICMP_TIMESTAMP and ICMP_TIMESTAMPREPLY
25.3.8. ICMP_INFO_REQUEST and ICMP_INFO_REPLY
25.3.9. ICMP_ADDRESS and ICMP_ADDRESSREPLY
25.4.1. ping
25.4.2. traceroute
25.7.1. icmphdr Structure
25.7.2. icmp_control Structure
25.7.3. icmp_bxm Structure
25.8.1. Transmitting ICMP Error Messages
25.8.2. Replying to Ingress ICMP Messages
25.8.3. Rate Limiting
25.8.4. Implementation of Rate Limiting
25.8.5. Receiving ICMP Messages
25.8.6. Processing ICMP_ECHO and ICMP_ECHOREPLY Messages
25.8.7. Processing the Common ICMP Messages
25.8.8. Processing ICMP_REDIRECT Messages
25.8.9. Processing ICMP_TIMESTAMP and ICMP_TIMESTAMPREPLY Messages
25.8.10. Processing ICMP_ADDRESS and ICMP_ADDRESSREPLY Messages
26.1. What Is a Neighbor?
26.2.1. When L3 Addresses Need to Be Translated to L2 Addresses
26.2.2. Shared Medium
26.2.3. Why Static Assignment of Addresses Is Not Sufficient
26.2.4. Special Cases
26.2.5. Solicitation Requests and Replies
26.3.1. Neighboring Protocols
26.4.1. Conditions Required by the Proxy
26.6.1. Reachability
26.6.2. Transitions Between NUD States
26.6.2.1. Basic states
26.6.2.2. Derived states
26.6.2.3. Initial state
26.6.3. Reachability Confirmation
27.1. Main Data Structures
27.2.1. Initialization of neigh->ops
27.2.2. Initialization of neigh->output and neigh->nud_state
27.2.2.1. Common state changes: neigh_connect and neigh_suspect
27.2.2.2. Routines used for neigh->output
27.2.3. Updating a Neighbor's Information: neigh_update
27.2.3.1. neigh_update optimization
27.2.3.2. Initial neigh_update operations
27.2.3.3. Changes of link layer address
27.2.3.4. Notifications to arpd
27.3.1. Caching
27.3.2. Timers
27.5.1. The neigh_create Function's Parameters
27.5.2. Neighbor Initialization
27.6.1. Garbage Collection
27.6.1.1. Synchronous cleanup: the neigh_forced_gc function
27.6.1.2. Asynchronous cleanup: the neigh_periodic_timer function
27.7.1. Delayed Processing of Solicitation Requests
27.7.2. Per-Device Proxying and Per-Destination Proxying
27.8.1. Methods Provided by the Device Driver
27.8.2. Link Between Routing and L2 Header Caching
27.8.3. Cache Invalidation and Updating
27.10.1. Events Generated by the Neighboring Layer
27.10.2. Events Received by the Neighboring Layer
27.10.2.1. Updates via neigh_ifdown
27.10.2.2. Updates via neigh_changeaddr (netdevice notification chain)
27.12.1. Ingress Queuing
27.12.2. Egress Queuing
28.1. ARP Packet Format
28.1.1. Destination Address Types for ARP Packets
28.3.1. Change of L2 Address
28.3.2. Duplicate Address Detection
28.3.3. Virtual IP
28.5.1. Compile-Time Options
28.5.2. /proc Options
28.5.2.1. ARP_ANNOUNCE
28.5.2.2. ARP_IGNORE
28.5.2.3. ARP_FILTER
28.5.2.4. Medium ID
28.6.1. The arp_tbl Table
28.7.1. Basic Initialization Sequence
28.7.2. Virtual Functions in the ops Field
28.7.3. Start of the arp_constructor Function
28.7.4. Devices That Do Not Need ARP
28.7.5. Devices That Need ARP
28.8.1. Transmitting ARP Packets: Introduction to arp_send
28.8.2. Solicitations
28.8.2.1. ARP_ANNOUNCE and selection of source IP address
28.9.1. Initial Common Processing
28.9.2. Processing ARPOP_REQUEST Packets
28.9.2.1. Passive learning and ARP optimization
28.9.2.2. Requests with zero addresses
28.9.3. Processing ARPOP_REPLY Packets
28.9.4. Final Common Processing
28.10.1. Destination NAT (DNAT)
28.10.2. Proxy ARP Server as Router
28.12.1. Received Events
28.12.2. Generated Events
28.12.3. Wake-on-LAN Events
28.13.1. Kernel Side
28.13.2. User-Space Side
29.1. System Administration of Neighbors
29.1.1. Common Routines
29.1.2. New-Generation Tool: IPROUTE2's ip Command
29.1.3. Old-Generation Tool: net-tools's arp Command
29.2.1. The /proc/sys/net/ipv4/neigh Directory
29.2.1.1. Initialization of global and per-device directories
29.2.1.2. Directory creation
29.2.2. The /proc/sys/net/ipv4/conf Directory
29.3.1. neighbour Structure
29.3.2. neigh_table Structure
29.3.3. neigh_parms Structure
29.3.4. neigh_ops Structure
29.3.5. hh_cache Structure
29.3.6. neigh_statistics Structure
29.3.7. Data Structures Featured in This Part of the Book
30.1. Routers, Routes, and Routing Tables
30.1.1. Nonrouting Multihomed Hosts
30.1.2. Varieties of Routing Configurations
30.1.3. Questions Answered in This Part of the Book
30.2.1. Scope
30.2.1.1. Use of the scope
30.2.2. Default Gateway
30.2.3. Directed Broadcasts
30.2.4. Primary and Secondary Addresses
30.2.4.1. Old-generation configuration: aliasing interfaces
30.2.4.2. Relationship between aliasing devices and primary/secondary status
30.3.1. Special Routes
30.3.2. Route Types and Actions
30.3.3. Routing Cache
30.3.4. Routing Table Versus Routing Cache
30.3.5. Routing Cache Garbage Collection
30.3.5.1. Examples of events that can expire cache entries
30.3.5.2. Examples of eligible cache victims
30.4.1. Longest Prefix Match
31.1. Concepts Behind Policy Routing
31.1.1. Lookup with Policy Routing
31.1.2. Routing Table Selection
31.2.1. Next Hop Selection
31.2.2. Cache Support for Multipath
31.2.2.1. Weighted random algorithm
31.2.2.2. Device round-robin algorithm
31.2.3. Per-Flow, Per-Connection, and Per-Packet Distribution
31.2.3.1. Equalizer algorithm
31.3.1. Routing Table Based Classifier
31.3.1.1. Configuring policy realms
31.3.1.2. Configuring route realms
31.3.1.3. Computing the routing tag
31.3.2. Policy Routing and Firewall-Based Classifier
31.6.1. Shared Media
31.6.2. Transmitting ICMP_REDIRECT Messages
31.6.3. Processing Ingress ICMP_REDIRECT Messages
32.1. Kernel Options
32.1.1. Basic Options
32.1.2. Advanced Options
32.1.3. Recently Dropped Options
32.2.1. Lists and Hash Tables
32.3.1. Route Scopes
32.3.2. Address Scopes
32.3.3. Relationship Between Route and Next-Hop Scopes
32.8.1. Helper Routines
32.8.2. Changes in IP Configuration
32.8.2.1. Adding an IP address
32.8.2.2. Removing an IP address
32.8.3. Changes in Device Status
32.8.3.1. Impacts on the routing tables
32.8.3.2. Impacts on the policy database
32.8.3.3. Impacts on the IP configuration
32.9.1. Netlink Notifications
32.9.2. Policy Routing and Firewall-Based Classifier
32.9.3. Routing Protocol Daemons
33.1. Routing Cache Initialization
33.3.1. Cache Locking
33.3.2. Cache Entry Allocation and Reference Counts
33.3.3. Adding Elements to the Cache
33.3.4. Binding the Route Cache to the ARP Cache
33.3.5. Cache Lookup
33.3.5.1. Ingress lookup
33.3.5.2. Egress lookup
33.4.1. Registering a Caching Algorithm
33.4.2. Interface Between the Routing Cache and Multipath
33.4.3. Helper Routines
33.4.4. Common Elements Between Algorithms
33.4.5. Random Algorithm
33.4.6. Weighted Random Algorithm
33.4.7. Round-Robin Algorithm
33.4.8. Device Round-Robin Algorithm
33.5.1. IPsec Transformations and the Use of dst_entry
33.5.2. External Events
33.7.1. Synchronous Cleanup
33.7.2. rt_garbage_collect Function
33.7.3. Asynchronous Cleanup
33.7.4. Expiration Criteria
33.7.5. Deleting DST Entries
33.7.6. Variables That Tune and Control Garbage Collection
34.1. Organization of Routing Hash Tables
34.1.1. Organization of Per-Netmask Tables
34.1.1.1. Basic structures for hash table organization
34.1.1.2. Dynamic resizing of per-netmask hash tables
34.1.2. Organization of fib_info Structures
34.1.2.1. Dynamic resizing of global hash tables
34.1.3. Organization of Next-Hop Router Structures
34.1.4. The Two Default Routing Tables: ip_fib_main_table and ip_fib_local_table
34.3.1. Adding a Route
34.3.2. Deleting a Route
34.3.3. Garbage Collection
34.4.1. Variable and Structure Definitions
34.4.2. Double Definitions for Functions
35.1. High-Level View of Lookup Functions
35.3.1. Semantic Matching on Subsidiary Criteria
35.3.1.1. Criteria for rejecting routes
35.3.1.2. Return value from fib_semantic_match
35.5.1. Initialization of Function Pointers for Ingress Traffic
35.5.2. Initialization of Function Pointers for Egress Traffic
35.5.3. Special Cases
35.7.1. Creation of a Cache Entry
35.7.2. Preferred Source Address Selection
35.7.3. Local Delivery
35.7.4. Forwarding
35.7.5. Routing Failure
35.8.1. Search Key Initialization
35.8.2. Selecting the Source IP Address
35.8.3. Local Delivery
35.8.4. Transmission to Other Hosts
35.8.5. Interaction Between Multipath and Default Gateway Selection
35.8.6. Default Gateway Selection
35.8.7. fn_hash_select_default Function
35.9.1. Multipath Caching
35.10.1. fib_lookup with Policy Routing
35.10.2. Default Gateway Selection with Policy Routing
35.12.1. Storing the Realms
35.12.2. Helper Routines
35.12.3. Computing the Routing Tag
36.1. User-Space Configuration Tools
36.1.1. Configuring Routing with IPROUTE2
36.1.1.1. Correspondence between IPROUTE2 user commands and kernel functions
36.1.1.2. inet_rtm_newroute and inet_rtm_delroute functions
36.1.2. Configuring Routing with net-tools
36.1.3. Change Notifications
36.1.4. Routes Inserted by the Kernel: The fib_magic Function
36.3.1. The /proc/sys/net/ipv4 Directory
36.3.2. The /proc/sys/net/ipv4/route Directory
36.3.3. The /proc/sys/net/ipv4/conf Directory
36.3.3.1. Special subdirectories
36.3.3.2. Use of the special subdirectories
36.3.3.3. File descriptions
36.3.4. The /proc/net and /proc/net/stat Directories
36.5.1. fib_table Structure
36.5.2. fn_zone Structure
36.5.3. fib_node Structure
36.5.4. fib_alias Structure
36.5.5. fib_info Structure
36.5.6. fib_nh Structure
36.5.7. fib_rule Structure
36.5.8. fib_result Structure
36.5.9. rtable Structure
36.5.10. dst_entry Structure
36.5.11. dst_ops Structure
36.5.12. flowi Structure
36.5.13. rt_cache_stat Structure
36.5.14. ip_mp_alg_ops Structure

Content preview from Understanding Linux Network Internals

Chapter 4. Notification Chains

The kernel's many subsystems are heavily interdependent, so an event detected or generated by one of them could be of interest to others. To fulfill the need for interaction, Linux uses so-called notification chains.

In this chapter, we will see:

How notification chains are declared and what chains are defined by the networking code
How a kernel subsystem can register with a notification chain
How a kernel subsystem generates a notification on a chain

Note that notification chains are used only between kernel subsystems. Notifications between kernel and user space rely on other mechanisms, such as those introduced in Chapter 3.
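
The three steps listed above (declaring a chain, registering with it, and generating a notification) can be sketched in ordinary user-space C. This is only a minimal model under stated assumptions: every name in it (demo_notifier, demo_chain_register, demo_chain_unregister, demo_chain_notify, demo_count_events, DEMO_OK, DEMO_STOP) is hypothetical and chosen for illustration, it does not reproduce the kernel's own interface, and it omits the locking a real kernel chain would need. A chain here is simply a linked list of callbacks, kept in descending priority order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a notification chain.
 * None of these names are kernel APIs; they are illustrative only. */
struct demo_notifier {
    int (*call)(struct demo_notifier *self, unsigned long event, void *data);
    struct demo_notifier *next;
    int priority;                 /* higher priority callbacks run first */
};

#define DEMO_OK   0               /* keep walking the chain */
#define DEMO_STOP 1               /* callback asks to stop the walk */

/* Register: insert nb into the list, keeping descending priority order.
 * Entries with equal priority keep their registration order. */
static void demo_chain_register(struct demo_notifier **head,
                                struct demo_notifier *nb)
{
    assert(head != NULL && nb != NULL);
    while (*head != NULL && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Unregister: unlink nb; returns 0 on success, -1 if nb was not found. */
static int demo_chain_unregister(struct demo_notifier **head,
                                 struct demo_notifier *nb)
{
    while (*head != NULL) {
        if (*head == nb) {
            *head = nb->next;
            nb->next = NULL;
            return 0;
        }
        head = &(*head)->next;
    }
    return -1;
}

/* Notify: invoke every registered callback in order until one
 * returns DEMO_STOP. Returns the last callback's return value. */
static int demo_chain_notify(struct demo_notifier *head,
                             unsigned long event, void *data)
{
    int ret = DEMO_OK;
    while (head != NULL) {
        struct demo_notifier *next = head->next;  /* read before the call */
        ret = head->call(head, event, data);
        if (ret == DEMO_STOP)
            break;
        head = next;
    }
    return ret;
}

/* Example subscriber: counts how many events it has been told about. */
static int demo_count_events(struct demo_notifier *self,
                             unsigned long event, void *data)
{
    int *counter = data;
    (void)self;
    (void)event;
    if (counter != NULL)
        (*counter)++;
    return DEMO_OK;
}
```

In this sketch, the subsystem that owns an event declares the chain as a `struct demo_notifier *` initialized to NULL, interested parties fill in a `demo_notifier` and pass it to `demo_chain_register`, and the owner calls `demo_chain_notify` when the event occurs. The publisher never needs to know who is listening, which is the decoupling the chapter describes.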

Reasons for Notification Chains

Suppose we had the Linux router in Figure 4-1 with four interfaces. The figure shows the relationship between the router and five networks, along with a simplified version of its routing table.

Let's look at two examples from the topology in Figure 4-1. Network A is directly connected to RT on interface eth0. Network F is not directly connected to RT; instead, RT's eth3 is directly connected to another router that has an interface with address IP1, and that second router knows how to reach network F. The other cases are similar. In short, some networks are directly connected, while others can be reached only with the help of one or more additional routers.

For a detailed description of how the routing code handles this situation, refer to Part VII. In this chapter, we will concentrate on the role of notification ...

Publisher Resources

ISBN: 0596002556
