To utilize the robust high availability toolkit provided in JUNOS, you must fully understand the software components of the RE and how they work together to build a highly available operating system. As we discussed in Chapter 3, JUNOS provides a clear separation between the forwarding and control planes. This separation creates an environment in which the router can continue to forward traffic even when its control plane is down. As long as traffic keeps flowing through the router, users do not experience any network-related issues.
The RE is the brain that stores the building blocks of system availability, providing all the necessary tools for routing protocols and route calculations. The main function of the RE is to perform route management, using a vastly modified Unix Routing Protocol Daemon (RPD). Because route management is a complex function, the RPD divides its work into many tasks and runs its own scheduler to prioritize them, ensuring that each protocol and route calculation receives the appropriate resources to perform its job.
The primary goal of the RPD is to create and maintain the Routing Information Base (RIB), which is a database of routing entries. Each routing entry consists of a destination address and some form of next hop information. RPD maintains the routing table and properly distributes routes from the routing table into the kernel and the hardware complexes used for traffic forwarding.
While almost all network equipment vendors use the concept of a RIB for Border Gateway Protocol (BGP), JUNOS uses a RIB-based structure for all of its routing tables. To understand routing for high availability in your network, it is important to know the table names and to understand the role of each table. Table 4-1 describes the JUNOS routing tables.
Table 4-1. Routing tables implemented in JUNOS
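Each of these tables can be inspected directly from the operational-mode CLI. A few representative commands (output omitted; the table names shown are the defaults discussed in this chapter):
lab@r1> show route summary
lab@r1> show route table inet.0
lab@r1> show route table bgp.l3vpn.0
The summary form lists every routing table currently instantiated on the router, along with its route counts.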
The RPD stores routes in these tables and moves routes among the tables as needed. For example, when the router receives routing information from a routing protocol in the form of newly advertised routes, such as a BGP update message, the routing update is stored in the table called RIB-IN. The RPD runs BGP import policies and the BGP best route selection algorithm on the received routes to create an ordered set of usable routes. The final results of the route selection process are stored in the main JUNOS RIB, inet.0. As BGP prepares to advertise routes to its peers, the export policy is run against the routes and the results are moved into the outgoing table, RIB-OUT.
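You can peek into a peer's RIB-IN and RIB-OUT views directly from the CLI. A quick sketch, using a placeholder peer address and with output omitted:
lab@r1> show route receive-protocol bgp 192.0.2.1
lab@r1> show route advertising-protocol bgp 192.0.2.1
The first command lists the routes received from that peer; the second lists the routes being advertised to it after export policy has been applied.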
The RPD stores routes for BGP-based Layer 3 VPNs in the table bgp.l3vpn.0, which is populated by Multiprotocol BGP (MP-BGP). As JUNOS software runs the configured policies against the information in the table, all acceptable routes are sent to one or more routing-instance tables, while any routing information that is unacceptable to the policies is marked as hidden.
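A couple of commands useful for examining this behavior (output omitted; bgp.l3vpn.0 is the default table described above):
lab@r1> show route table bgp.l3vpn.0
lab@r1> show route table bgp.l3vpn.0 hidden
The hidden form lists the L3VPN routes that the configured policies have excluded.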
After the RPD route selection process is finalized, RPD copies the selected routes into the kernel’s copy of the routing table using IPC messages. JUNOS does not rely on BSD’s default routing socket for IPC; instead, it uses a specialized socket that allows any daemon within the box to communicate with the kernel. For example, RPD uses routing sockets to signal the addition, deletion, or change of routes to the kernel. Similarly, the dcd daemon, which is responsible for interface management, communicates with the kernel over the same routing socket type when it signals the addition, deletion, or status change of an interface. Likewise, the chassisd daemon updates the kernel with any new or changed hardware status using the same routing socket type.
The protocol used for this IPC is the Trivial Network Protocol (TNP). TNP is a Layer 3 protocol (like IP) and uses Ethernet II encapsulation. Like any Layer 3 protocol, it uses source and destination addresses, and it forms and maintains neighbor relationships using Hello messages. The TNP Hello message carries Hello timers and dead intervals, which are used to detect the failure of other hardware components (REs, Packet Forwarding Engines or PFEs, and Flexible PIC Concentrators or FPCs). While you cannot configure most of the TNP Hello timers, you can configure the Hello and dead time intervals between the two REs through the command-line interface (CLI) keepalive command.
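As a minimal configuration sketch, assuming a dual-RE chassis, the RE-to-RE keepalive interval lives under the chassis redundancy hierarchy; the value shown here is arbitrary, and the exact statements and defaults vary by platform and JUNOS release:
[edit]
lab@r1# set chassis redundancy keepalive-time 10
lab@r1# set chassis redundancy failover on-loss-of-keepalives
The second statement asks the backup RE to take mastership if keepalives from the master stop arriving.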
Note
Although the netstat commands do work in JUNOS, because JUNOS uses a raw socket type, the IPC is not visible with the plain Unix netstat -a command. Communication is formed using rtsock messages and can be viewed using the rtsockmon command.
Continuing our look at the RE’s processes, the next step is that the kernel copies the RPD’s forwarding table to the PFE. The table structure is modified so that it can be stored in the PFE’s application-specific integrated circuits (ASICs), which make forwarding decisions. These decisions are based on proprietary Radix tree route lookups (called J-Tree route lookups) that are performed on each ASIC. As inet.0 and subsequent forwarding table routes are modified, the RE kernel incrementally updates the copy stored in the PFE that is used for forwarding. Because the microkernel of each ASIC contains all routes and their respective next hops, the actual forwarding process continues even when the primary RE is brought down, as shown in Figure 4-1. The fact that forwarding can continue while the control plane is unavailable, such as during an RE switchover, is important for understanding high availability solutions.
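One JUNOS feature that builds directly on this separation is graceful Routing Engine switchover, which keeps the PFE forwarding state intact across a switchover. A minimal configuration sketch, shown here for orientation only (the feature itself belongs with the high availability solutions discussed later):
[edit]
lab@r1# set chassis redundancy graceful-switchover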
To understand how you can use the routing features of JUNOS high availability in your network, it is best to actually visualize the routing update process discussed in the previous section. The following steps and code snippets walk through the control plane and forwarding plane interactions; they also serve as an excellent troubleshooting tool for diagnosing issues that might occur during network degradation.
During normal operation, one of the REs should be online and labeled as master. The master RE should have open, active connections with the rest of the hardware components. The following commands verify the state of the hardware, as seen by the RE.
lab@r1> show chassis routing-engine
Routing Engine status:
  Slot 0:
    Current state                  Master
    Election priority              Master (default)
    ......
    Uptime                         2 days, 1 hour, 55 minutes, 53 seconds
    Load averages:                 1 minute   5 minute  15 minute
                                       0.00       0.00       0.00
Routing Engine status:
  Slot 1:
    Current state                  Backup
    Election priority              Backup (default)
    ......
    Start time                     2008-09-17 05:59:19 UTC
    Uptime                         17 hours, 53 minutes, 38 seconds

lab@r1> show chassis fpc
                     Temp  CPU Utilization (%)   Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      DRAM (MB) Heap     Buffer
  0  Online            21      7          0       1024       24         31
  1  Online            21      4          0       1024       24         31
  2  Online            22      5          0       1024       18         31
  3  Online            22      5          0       1024       17         31
  4  Empty
  5  Empty
All hardware components should now be online, and one of the REs should be listed in the Master state. If any hardware component is not online, you can start troubleshooting by examining the IPC performed by TNP between the different hardware components. Specifically, look at the connections between the RE and the PFE and make sure you see an RDP OPEN state for each online PFE component:
lab@r1> start shell
lab@r1% netstat -a -f tnp
Active TNP connections (including servers)
Proto Recv-Q Send-Q  Local Address        Foreign Address      (state)
<...>
rdp        0      0  master.pfed          feb0.46081           OPEN
rdp        0      0  master.chassisd      feb0.46080           OPEN
rdp        0      0  master.pfed          fpc2.24577           OPEN
rdp        0      0  master.chassisd      fpc2.24576           OPEN
udp        0      0  *.sampled            *.*
udp        0      0  *.sampled            *.*
rdp        0      0  *.1013               *.*                  LISTEN
rdp        0      0  *.chassisd           *.*                  LISTEN
If you see state issues in the previous step, research them further by monitoring the internal management interface:
lab@r1> monitor traffic interface em1
verbose output suppressed, use <detail> or <extensive> for full protocol decode
Listening on em1, capture size 96 bytes
02:51:40.239754 Out TNPv2 master.1021 > re1.1021: UDP, length 8
02:51:40.397159 In TNPv2 re1.1021 > re0.1021: UDP, length 8
02:51:41.249676 Out TNPv2 master.1021 > re1.1021: UDP, length 8
02:51:41.407092 In TNPv2 re1.1021 > re0.1021: UDP, length 8
02:51:42.259578 Out TNPv2 master.1021 > re1.1021: UDP, length 8
02:51:42.416900 In TNPv2 re1.1021 > re0.1021: UDP, length 8
02:51:43.269506 Out TNPv2 master.1021 > re1.1021: UDP, length 8
02:51:43.426834 In TNPv2 re1.1021 > re0.1021: UDP, length 8
Once the RE is online and you have configured a BGP neighbor, the next step is to verify the state of the BGP adjacency:
lab@r1> show bgp summary
Groups: 1 Peers: 1 Down peers: 0
Table Tot Paths Act Paths Suppressed History Damp State Pending
inet.0 28 28 0 0 0 0
bgp.l3vpn.0 13 13 0 0 0 0
bgp.mvpn.0 5 5 0 0 0 0
Peer            AS     InPkt   OutPkt   OutQ  Flaps Last Up/Dwn State|#Active/Received/Damped...
69.191.3.199 33181 68 24 0 0 8:21 Establ
inet.0: 28/28/0
bgp.l3vpn.0: 13/13/0
bgp.mvpn.0: 5/5/0
vpn.inet.0: 13/13/0
vpn.mvpn.0: 5/5/0
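If the session were stuck in Active or Connect rather than Establ, the per-neighbor view would give more detail, including the negotiated options, timers, and the last error recorded (output omitted):
lab@r1> show bgp neighbor 69.191.3.199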
Next, verify that the route updates are being received. The following command peeks into the BGP RIB-IN table:
lab@r1> show route receive-protocol bgp 69.191.3.199
inet.0: 64 destinations, 64 routes (63 active, 0 holddown, 1 hidden)
Prefix Nexthop MED Lclpref AS path
* 3.3.3.3/32 69.191.3.201 2000 100 13908 I
* 4.4.4.4/32 69.191.3.201 2000 100 13908 I
* 69.184.0.64/26 69.191.3.201 2025 100 13908 I
* 69.184.25.64/28 69.191.3.201 100 13908 I
* 69.184.25.80/28 69.191.3.201 100 13908 I
* 69.184.25.96/28 69.191.3.201 100 13908 I
* 101.0.0.0/30 69.191.3.201 100 13908 ?
* 128.23.224.4/30 69.191.3.201 100 13908 ?
... output truncated...
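Note the one hidden route reported in the table header. Hidden routes are those rejected by policy or left unusable by an unresolvable next hop; as a troubleshooting sketch (output omitted), you can list them and, with the extensive option, usually see why they are hidden:
lab@r1> show route hidden
lab@r1> show route hidden extensive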
The following CLI output proves that a route has gone through the BGP selection process and has been marked as active:
lab@r1> show route 3.3.3.3

inet.0: 64 destinations, 64 routes (63 active, 0 holddown, 1 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

3.3.3.3/32         *[BGP/170] 1w6d 21:55:50, MED 2000, localpref 100, from 69.191.3.199
                      AS path: 13908 I
                    > to 172.24.160.1 via fe-1/3/1.0
The following output gives more details about the actual BGP selection process, including the reasons the route was activated or deactivated, the BGP next hop, the physical next hop, and the state of the route:
lab@r1> show route 3.3.3.3 extensive

inet.0: 64 destinations, 64 routes (63 active, 0 holddown, 1 hidden)
3.3.3.3/32 (1 entry, 1 announced)
TSI:
KRT in-kernel 3.3.3.3/32 -> {indirect(262148)}
        *BGP    Preference: 170/-101
                Next hop type: Indirect
                Next-hop reference count: 75
                Source: 69.191.3.199
                Next hop type: Router, Next hop index: 462
                Next hop: 172.24.160.1 via fe-1/3/1.0, selected
                Protocol next hop: 69.191.3.201
                Indirect next hop: 89bb000 262148
                State: <Active Int Ext>
                Local AS: 33181 Peer AS: 33181
                Age: 1w6d 21:54:56      Metric: 2000    Metric2: 1001
                Announcement bits (2): 0-KRT 7-Resolve tree 2
                Task: BGP_33181.69.191.3.199+179
                AS path: 13908 I (Originator) Cluster list: 69.191.3.199
                AS path: Originator ID: 69.191.3.201
                Communities: 13908:5004
                Localpref: 100
                Router ID: 69.191.3.199
                Indirect next hops: 1
                        Protocol next hop: 69.191.3.201 Metric: 1001
                        Indirect next hop: 89bb000 262148
                        Indirect path forwarding next hops: 1
                                Next hop type: Router
                                Next hop: 172.24.160.1 via fe-1/3/1.0
                        69.191.3.201/32 Originating RIB: inet.0
                                Metric: 1001    Node path count: 1
                                Forwarding nexthops: 1
                                        Nexthop: 172.24.160.1 via fe-1/3/1.0
This output shows the kernel copy of the forwarding table. This table, which is present on the RE, is sent to the PFE complex by means of routing update messages:
lab@r1> show route forwarding-table destination 3.3.3.3 extensive
Routing table: inet [Index 0]
Internet:
Destination: 3.3.3.3/32
Route type: user
Route reference: 0 Route interface-index: 0
Flags: sent to PFE, prefix load balance
Next-hop type: indirect Index: 262148 Reference: 26
Nexthop: 172.24.160.1
Next-hop type: unicast Index: 462 Reference: 47
Next-hop interface: fe-1/3/1.0
This step looks at the rtsock messages being used to replicate the kernel table into the PFE complex:
lab@r1> start shell
% rtsockmon -t
           sender  flag  type     op
[20:07:40] rpd     P     nexthop  add   inet 172.24.160.1 nh=indr flags=0x1 idx=262142 ifidx=68 filteridx=0
[20:07:40] rpd     P     route    add   inet 69.184.0.64 tid=0 plen=26 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 199.105.185.224 tid=0 plen=28 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 160.43.3.144 tid=0 plen=28 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 172.24.231.252 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 4.4.4.4 tid=0 plen=32 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 172.24.95.208 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 172.24.95.204 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 172.24.231.248 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 172.24.95.196 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 3.3.3.3 tid=0 plen=32 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     route    add   inet 160.43.175.0 tid=0 plen=27 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd     P     nexthop  add   inet 172.24.160.1 nh=ucst flags=0x85 idx=494 ifidx=68 filteridx=0
Using a VTY session to access the PFE complex, you can determine which actual forwarding entries are present in the PFE’s ASICs. The following output shows that the routing entry for destination 3.3.3.3 is present and has a valid next hop:
lab@r1> start shell
% su
Password:
root@r1% vty feb

CSBR platform (266Mhz PPC 603e processor, 128MB memory, 512KB flash)

CSBR0(r1 vty)# show route ip prefix 3.3.3.3 detail
IPv4 Route Table 0, default.0, 0x0:
Destination               NH IP Addr      Type     NH ID Interface
------------------------- --------------- -------- ----- ---------
3.3.3.3                   172.24.160.1    Indirect 262142 fe-1/3/1.0
  RT flags: 0x0010, Ignore: 0x00000000, COS index: 0, DCU id: 0, SCU id: 0
  RPF ifl list id: 0, RPF tree: 0x00000000
  PDP[0]:  0x00000000
  Second NH[0]:  0x00000000
The exercise in this section verified the state of the control plane and followed the routing update process from the initial establishment of a BGP peering session all the way to installing the routing entry into the forwarding ASIC on the PFE. Now that you understand the control and forwarding plane processes and their interactions, we can discuss the different high availability solutions available through JUNOS software.