Graceful Routing Engine Switchover (GRES) takes advantage of the separation between the control and forwarding planes in JUNOS software to provide system redundancy. GRES allows the control plane, in this case the RE, to switch over to its backup RE without any interruption to the existing traffic flows in the PFE. When GRES is configured on the router, the kernel state is synchronized between the two REs to preserve routing and forwarding state. Any change to the routing state on the primary RE results in an automatic incremental update of the kernel state on the backup RE. As you can see, the GRES concept is very simple.
However, the limitation of GRES is that, by itself, it cannot provide router redundancy. Even though traffic continues to flow through the router during a switchover between REs, the flow occurs for only a limited time. As soon as any of the protocol timers expire, the neighbor relationship between routers is dropped and traffic is stopped at the upstream router. To maintain high availability, networks must quickly discover when a neighbor goes down and must implement the lowest possible timers. Because GRES provides an intact forwarding plane, it is to our advantage not to drop any of the adjacencies, but rather to continue sending traffic toward the failed router.
The solution to this limitation is the Graceful Restart (GR) protocol extension. GR signals to all supporting protocol neighbors that the failing router is still capable of forwarding traffic even though it is having control plane problems, and that it needs their help to maintain protocol adjacencies. GRES provides zero loss only when supplemented with GR running between the failed router and all of its neighbors. For more about GR, see Graceful Restart.
Note
As we will see later in this chapter, the key factor for using GR is to have a stable topology. Any topology change results in adjacency loss.
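As a preview (the details are covered in the Graceful Restart section), here is a minimal, hedged sketch of the knob that enables GR globally under routing-options, assuming you want restart support for all protocols that honor this statement; GR can also be enabled per protocol:

[edit]
lab@r1# set routing-options graceful-restart
lab@r1# commit synchronize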
Enabling GRES on the router changes the information flow from the RE to the PFE. By default, without GRES enabled, the RE signals any state change or routing update directly to the PFE. When GRES is enabled, however, these changes must first be duplicated on the backup RE. Updating the backup RE first avoids corner cases in which the PFE and the backup RE could end up out of sync. For example, if the backup RE were updated last and the primary RE crashed after it updated the PFE but before it updated the backup RE, the backup RE and the PFE would be out of sync. The problems that would result would be immense, so the replication order makes sure that this condition never happens.
Because state replication is a broad term, let’s examine further what exactly is being replicated. From the user perspective, all the interfaces and their states are replicated when the interfaces restart. Additionally, all Layer 2 protocol states are preserved, such as ATM, Frame Relay, and Point-to-Point Protocol (PPP) states. All routes from the kernel table, as well as Address Resolution Protocol (ARP) entries and firewall states, are preserved as well. However, TCP state and actual RPD routes are not preserved, because they are rebuilt either locally by means of a kernel copy or through new neighbor discovery.
From the JUNOS perspective, three states are being replicated. Understanding them will help you to configure and troubleshoot the state replication process used in GRES. The three states are:
- Configuration database
The configuration database is the repository of the router’s configuration files. Different daemons query this database as needed. The dcd daemon, which manages interfaces, queries it as an interface is brought online. At the same time, the chassisd daemon uses this database when it manages hardware components. The RPD also uses this database, using the configuration information stored in the database to control all routing protocols. To ensure that this state is preserved and the database is always in sync, you must use the commit synchronize command when committing a change in a redundant RE configuration (an example follows this list). JUNOS displays an error prompt when GRES is enabled but you fail to commit configuration changes with a commit synchronize command.
- Kernel with all its entries
Configuring GRES starts the ksyncd daemon, which is responsible for all kernel state replication. ksyncd is a custom JUNOS daemon used only for replication tasks between different hardware components. Here, it is used to replicate the kernel state. ksyncd uses regular IPC Rtsock messages to carry information from the kernel on the primary RE to the kernel on the backup RE. When a GRES restart event occurs, the RPD starts up on the backup RE, reads all saved routes from the kernel, and puts them into routing tables as kernel routes (KRT). These routes stay active for a maximum of three minutes. Any routing changes resulting in routing updates, even those as simple as ARP entries, are signaled incrementally to the backup RE by means of ksyncd.
- PFE state
PFE state replication is done by chassisd. When a GRES restart event occurs, chassisd does a soft restart of all the hardware, querying it for an inventory. The hardware that responds is reattached to the backup RE and brought online without any disruption. However, if certain hardware fails to respond, it is restarted anyway. To make the replication process as efficient as possible, local user files, accounting information, logs, and traceoptions files are not replicated to the backup RE.
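Here is a hedged example of committing a change with synchronization on a dual-RE router; the confirmation messages are illustrative and vary slightly by release:

[edit]
lab@r1# commit synchronize
re0:
configuration check succeeds
re1:
commit complete
re0:
commit complete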
Figure 4-2 illustrates the state replication components and the flow of communication between them.
While the replication of all states is important, let’s look
further at the role of kernel state replication, because all routing
entries, including ARP-derived next hop addresses, depend on successful
replication of the kernel state. Once the user makes a change and
implements it with the commit sync
command, all the state replication takes place. All routing and next hop
entries derived from the RPD and found in routing tables inet.0, inet6.0, mpls.0, and inet.3 are replicated from the primary RE’s
kernel routing table to the secondary RE’s kernel table. All these
routes become “active” routes with duplicate forwarding entries in the
PFE. The routes and forwarding entries stay active for
about three minutes and then are purged. Once the RPD on the backup RE
initializes itself, it populates its routing table with the existing
kernel routes. We can then say that the GRES event has successfully
completed. From a high availability perspective, the RPD acquires the
most up-to-date network state information and refreshes its routing
tables. Figure 4-3 illustrates
the GRES state replication process.
GRES is supported only on routers with two REs running the
same JUNOS version. Use the show
version
command to verify the software version on the RE; by
default, when you log in, you are on the master RE:
lab@r1> show version
Hostname: r1
Model: mx480
JUNOS Base OS boot [9.2R2.15]
JUNOS Base OS Software Suite [9.2R2.15]
JUNOS Kernel Software Suite [9.2R2.15]
JUNOS Crypto Software Suite [9.2R2.15]
JUNOS Packet Forwarding Engine Support (M/T Common) [9.2R2.15]
JUNOS Packet Forwarding Engine Support (MX Common) [9.2R2.15]
JUNOS Online Documentation [9.2R2.15]
JUNOS Routing Software Suite [9.2R2.15]
From the master RE, you can log in to the backup RE and verify its software version:
lab@r1> request routing-engine login other-routing-engine
--- JUNOS 9.2R2.15 built 2008-10-03 19:32:58 UTC
lab@r1> show version
Hostname: r1
Model: mx480
JUNOS Base OS boot [9.2R2.15]
JUNOS Base OS Software Suite [9.2R2.15]
JUNOS Kernel Software Suite [9.2R2.15]
JUNOS Crypto Software Suite [9.2R2.15]
JUNOS Packet Forwarding Engine Support (M/T Common) [9.2R2.15]
JUNOS Packet Forwarding Engine Support (MX Common) [9.2R2.15]
JUNOS Online Documentation [9.2R2.15]
JUNOS Routing Software Suite [9.2R2.15]
If the systems are running the same JUNOS version, you can
configure GRES using the redundancy
statements under the chassis hierarchy:
[edit]
lab@r1# set chassis redundancy graceful-switchover
[edit]
lab@r1# set chassis redundancy failover on-disk-failure
[edit]
lab@r1# set chassis redundancy failover on-loss-of-keepalives
[edit]
lab@r1# set chassis redundancy routing-engine 0 master
[edit]
lab@r1# set chassis redundancy routing-engine 1 backup
[edit]
lab@r1# show chassis redundancy
routing-engine 0 master;
routing-engine 1 backup;
failover {
    on-loss-of-keepalives;
    on-disk-failure;
}
graceful-switchover;
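To confirm which RE currently holds mastership after committing this configuration, you can check the chassis view of both REs; the output below is abridged and illustrative:

{master}
lab@r1> show chassis routing-engine
Routing Engine status:
  Slot 0:
    Current state                  Master
    Election priority              Master (default)
  Slot 1:
    Current state                  Backup
    Election priority              Backup (default)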
RE failure is one type of failure that GRES monitors. The primary RE sends keepalive messages every second to the backup RE’s kernel. If the backup RE does not receive keepalives for two consecutive seconds, it presumes that the primary RE has failed and attempts to acquire mastership by starting the RE failover process. Another type of failure handled by GRES is media corruption. The kernel is intelligent enough to recognize storage media problems, both with the hard drive and with the CompactFlash drive. Because JUNOS software is built on a modified FreeBSD kernel, it constantly writes to its logfiles. When the kernel senses problems writing to or reading from the hard drive, it registers this as an RE failure and starts a GRES event.
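If you want the backup RE to wait longer before declaring the master dead (for example, during planned maintenance), the keepalive threshold can be tuned under the same hierarchy. A hedged sketch, assuming the keepalive-time statement is available in your release and using an illustrative value of 10 seconds:

[edit]
lab@r1# set chassis redundancy keepalive-time 10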
Note
In newer JUNOS releases, another configuration step is required to enable GRES: configuring the backup router. Although the backup router is not part of the redundancy mechanism, it is required so that the control plane is always reachable when the primary router is loading or recovering a configuration, or when this router is being configured. The backup router serves as a default route for the control plane, even though this route is never installed into the forwarding plane. Here is an example of configuring the backup router:
lab@re1# set system backup-router 184.12.1.12
Once GRES is enabled, verify its status by executing
the show system
switchover
command from the backup RE:
{backup}
lab@r1> show system switchover
Graceful switchover: On
Configuration database: Ready
Kernel database: Version incompatible
Peer state: Out of transition
The preceding output shows that the primary and backup REs are running different JUNOS versions.
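One remedy is to upgrade the backup RE so that both REs run the same release. The following is a hedged sketch, assuming the installation package has already been copied to /var/tmp on the backup RE; the package filename is purely illustrative:

lab@r1> request routing-engine login other-routing-engine

{backup}
lab@r1> request system software add /var/tmp/jinstall-9.2R2.15-domestic-signed.tgz reboot

Once both REs are running the same JUNOS version, verify GRES again: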
{backup}
lab@r1> show system switchover
Graceful switchover: On
Configuration database: Ready
Kernel database: Ready
Peer state: Steady State
As the previous section shows, the steps to implement GRES are simple. However, there wouldn’t be such a high demand for skilled IT employees if everything always worked as configured and expected. Sometimes it will be necessary to troubleshoot how the protocols are working in your network.
As an example, let’s consider a scenario in which the entire configuration is in place and the kernel replication states look fine, but during a lab RE switchover test you observe traffic loss. Because GRES is enabled, no packets should be dropped—at least, that’s what Juniper promised us. Well, let’s analyze the situation and see what is happening.
To troubleshoot the actual problem, first verify the
configuration. Enabling traceoptions
on all the protocols and
GRES knobs helps us get to the root of the problem:
lab@r1# show protocols ospf
traceoptions {
    file ospf.trace;
    flag all detail;
}
[edit]
lab@r1# show protocols bgp
traceoptions {
    file bgp.trace;
    flag all detail;
}
Note
It is not recommended to have flag all
turned on under any protocol traceoptions
in the
production network. Logging all protocol events takes a toll on RPD
processing, which could jeopardize real-time protocol maintenance
such as BGP keepalives or OSPF hellos. Using flag all
is acceptable in this case because it is a lab environment.
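If you do need protocol tracing on a production router, a hedged compromise is to restrict the flags and bound the trace files so that they cannot fill the storage media; the flag and sizes here are illustrative:

[edit]
lab@r1# set protocols ospf traceoptions file ospf.trace size 10m files 3
lab@r1# set protocols ospf traceoptions flag error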
Then check the traceoptions
output with the
show log
command. If you see from the logs that the
protocols are working correctly and that it was not a protocol error
that caused the issue, the next step is to debug the actual GRES
communication:
[edit]
lab@r1# show chassis
redundancy {
    routing-engine 0 master;
    routing-engine 1 backup;
    failover {
        on-loss-of-keepalives;
        on-disk-failure;
    }
    graceful-switchover {
        traceoptions {
            flag all;
        }
        enable;
    }
}
[edit]
lab@r1# show routing-options
static {
    route 66.129.243.0/24 next-hop 172.18.66.1;
}
forwarding-table {
    traceoptions {
        flag all detail;
    }
}
When you enable traceoptions
in the GRES portion of the chassis redundancy configuration and on the
forwarding table, JUNOS software begins placing debugging information
about redundancy and the forwarding engine into logfiles, which you
can then review.
Because ksyncd
is the daemon responsible for kernel route replication, its
logs usually reveal a lot of information about state, potential
errors, misconfigurations, and software bugs. Check the logs with the
following command:
lab@r1> show log ksyncd
The logs themselves contain much information that can point you
in the proper direction when troubleshooting. The following code
snippet shows log output from ksyncd
:
lab@r1> show log ksyncd
Sep 10 23:27:09 Terminated: 15 signal received, posting signal
Sep 10 23:27:09 inspecting pending signals
Sep 10 23:27:09 SIGTERM posted, exiting
Sep 12 17:12:27 KSYNCD release 9.0R3.6 built by builder on 2008-08-01 05:05:43
UTC starting, pid 4564
Sep 12 17:12:27 Not runnable attempt 0 reason: not configured (errors: none)
Sep 12 17:12:27 Setting hw.re.is_slave_peer_gres_ready: 0
Near the end of the following output, an error message points to
a configuration error, specifically that the commit sync
command is not configured on the
master RE:
lab@r1> show log ksyncd
Sep 15 21:57:52 KSYNCD release 9.1R2.10 built by builder on 2008-07-01
05:06:40 UTC starting, pid 4566
Sep 15 21:57:52 Not runnable attempt 0 reason: undefined mode (errors: none)
Sep 15 21:58:01 Terminated: 15 signal received, posting signal
Sep 15 21:58:01 inspecting pending signals
Sep 15 21:58:01 SIGTERM posted, exiting
Sep 18 02:17:16 KSYNCD release 9.1R2.10 built by builder on 2008-07-01 05:06:40
UTC starting, pid 8998
Sep 18 02:17:16 Commit sync knob is NOT configured on master RE
Sep 18 02:17:16 Stop attempting to perform initial sync
Sep 18 02:17:16 config state: ready
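One way to clear this particular error is to make synchronized commits the default on the master RE so that every commit is automatically propagated to the backup RE; a minimal sketch:

[edit]
lab@r1# set system commit synchronize
lab@r1# commit synchronize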
Another problem that occurs quite often is that the REs on the
router are running different versions of JUNOS software. In the
ksyncd
logfile, you see this as
a version mismatch error. You could
easily search for this message when parsing the logfiles:
lab@r1> show log ksyncd
Sep 18 02:17:17 Register timer to wait for version info from master
Sep 18 02:17:17 recv RE msg subtype RE_MSG_RTSOCK_VERSION_REPLY
Sep 18 02:17:17 RE_MSG_RTSOCK_VERSION_REPLY :
Sep 18 02:17:17 rtm_n_msg_types : 0x00000059
Sep 18 02:17:17 rtm_version_checksum : 0x86496873
Sep 18 02:17:17 Received RTSOCK version message from master
Sep 18 02:17:17 Rtsock version checksum mismatch: master 2252957811, slave 1789492047
Sep 18 02:17:17 Version mismatch detected: Slumber time
Sep 18 02:17:17 Suspending due to unrecoverable error: version_mismatch
Sep 18 02:17:17 Not runnable attempt 0 reason: hard error (errors: version_mismatch )
Sep 18 02:17:17 closing connection to master
Sep 18 02:17:17 cleaning up kernel state
Sep 18 02:17:17 delete all commit proposals seqno 1658:
Sep 18 02:17:17 delete route rtb 0 af 2 rttype perm: skip not supported
Sep 18 02:17:17 delete route rtb 0 af 2 0.0.0.0 rttype perm: skip not supported
Sep 18 02:17:17 delete route rtb 0 af 2 66.129.243.0 rttype user: skip private
Sometimes issues may not be obvious in the ksyncd
logs, or the log may not reveal
anything specific. A further step in troubleshooting ksyncd
is to log in to the backup RE and run
the Unix command rtsockmon
from the
shell to view the actual route replication process:
lab@r1> start shell
% rtsockmon -t
           sender   flag  type       op
[20:07:40] rpd      P     route      add    inet 4.4.4.4 tid=0 plen=32 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd      P     route      add    inet 172.24.95.208 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd      P     route      add    inet 172.24.95.204 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd      P     route      add    inet 172.24.231.248 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd      P     route      add    inet 172.24.95.196 tid=0 plen=30 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
[20:07:40] rpd      P     route      add    inet 3.3.3.3 tid=0 plen=32 type=user flags=0x10 nh=indr nhflags=0x4 nhidx=262142 filtidx=0
This output shows that the route replication process seems to be
fine. If the configuration and
logfiles do not reveal anything out of the ordinary, you next look at
the bigger picture, such as networkwide issues. For example,
investigate the protocol traceoptions
logs to see whether an
adjacency dropped during a GRES event on a failed router. If this
appears to be the case, analyze the neighbor’s configuration for
Graceful Restart extensions. If you see a very similar configuration without the
Graceful Restart knob, the neighbor is missing the GR configuration:
lab@r1# show routing-options
static {
route 66.129.243.0/24 next-hop 172.18.66.1;
}
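A hedged sketch of the missing piece on the neighbor, assuming GR is to be enabled globally rather than per protocol (the hostname is hypothetical):

[edit]
lab@neighbor# set routing-options graceful-restart
lab@neighbor# show routing-options
static {
    route 66.129.243.0/24 next-hop 172.18.66.1;
}
graceful-restart;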
As you will see in the next section, the GRES concept alone is not sufficient to keep the network running during RE failures. GRES must be complemented with Graceful Restart.