Welcome to the world of automating Junos management! Since its introduction in the late 1990s, the user interface (UI) of the Junos software has set it apart from its competitors by making it easy for network operators to manage their devices using the command-line interface (CLI). In addition, Juniper has been a leader in network automation, shipping an API in the first Junos release and delivering the first external API in Junos 4.1.
However, times have changed. More and more, as operators look to automate their networks, other management interfaces are growing in importance. Juniper has kept up with this trend and is striving to be an industry leader by enabling automation to work with its devices.
This chapter sets the stage for the rest of the book by discussing the benefits of automation, reviewing some background information about the way the Junos management system works, and giving some basic information about the book.
Because you are reading this book, you probably already know at least some of the benefits of automation. However, we find that reviewing the possibilities of automation often reveals new ways to use it beyond the ones you were planning. Let’s review some common benefits.
For example, assume that you have a standard methodology for troubleshooting failed Border Gateway Protocol (BGP) sessions. First, you check the show interfaces output for the interface connected to the peer. After that, you ping the peer. Then, you look at the show bgp neighbor output for the peer. Finally, you look for log errors related to the peer or its interface.
It is completely acceptable to maintain a list of the appropriate commands and run them each time you need to troubleshoot a failed BGP session. However, you may be able to save time by reducing these to an automation script that runs the appropriate commands.
In fact, you may be able to save even more time by having the script actually interpret the command output for you. For example, what do you really want to know from the show interfaces output? You probably want to see whether the interface is up or down. You probably also want to look for unusual statistics (such as a very high data rate or a high error rate). Depending on what you see here, you may already know the reason for the session failure. (If the link is down, it isn’t necessary to look any further: no traffic will reach the peer.) By having the script look for obvious clues like this, you may be able to write a script that simply tells you why the BGP session is down when the cause is obvious. (That automated analysis can be a big help when you get a 4 a.m. troubleshooting call.)
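The triage order described above can be sketched in a few lines of Python. This is only a minimal illustration: the InterfaceStatus fields, the diagnose_bgp_session name, and the error threshold are all assumptions made for this example, and the code deliberately leaves out how the data would be collected from a device.

```python
from dataclasses import dataclass


@dataclass
class InterfaceStatus:
    """Hypothetical summary of the interface output for the peer-facing link."""
    name: str
    link_up: bool
    input_errors: int = 0
    output_errors: int = 0


def diagnose_bgp_session(interface, ping_ok, bgp_state, error_threshold=100):
    """Return a one-line diagnosis, checking the most obvious causes first."""
    if not interface.link_up:
        # No traffic can reach the peer, so there is no need to look further.
        return f"{interface.name} is down: no traffic can reach the peer"
    total_errors = interface.input_errors + interface.output_errors
    if total_errors > error_threshold:
        # error_threshold is an arbitrary value chosen for this sketch.
        return f"{interface.name} is up but shows {total_errors} errors"
    if not ping_ok:
        return "interface is up but the peer does not answer pings"
    if bgp_state != "Established":
        return f"peer is reachable but the BGP state is {bgp_state}"
    return "no obvious problem found: check the logs for the peer"


link = InterfaceStatus(name="ge-0/0/0", link_up=False)
print(diagnose_bgp_session(link, ping_ok=False, bgp_state="Active"))
```

The checks run in the same order an operator would follow by hand, so the function stops at the first obvious cause instead of reporting everything it sees.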
But why stop there? Why should you even need to run the script? Why not have your device automatically run the script for you whenever a peering session goes down and stays down for more than five minutes? Junos has the automation hooks to enable this sort of event-driven script execution.
And, of course, one of the big ways automation can save time is by simplifying the repetitive provisioning process, which often amounts to the use of standard fill-in-the-blanks templates. We’ll talk more about that in “Automation Prevents Copy/Paste Errors”.
For example, let’s say that your network core uses Multiprotocol Label Switching (MPLS) to forward traffic. Further, let’s assume MPLS is required because you have a large number of applications (such as virtual private networks [VPNs]) that require MPLS in order to operate correctly. Now imagine that someone provisions new network interconnects between core and edge routers, but forgets to enable MPLS processing for those interfaces on the core router. That scenario is a network outage waiting to happen! Once the other paths between the core and edge routers go down, MPLS traffic will be unable to flow over these new links.
Wouldn’t it be great if you could prevent someone from doing this by programming the network to know the expected configurations and catch omissions like this? Again, Junos has the automation hooks to enable this protection.
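To make the idea concrete, here is a small Python sketch of the check itself. It assumes the configuration has already been parsed into a nested dictionary; that layout, the function name, and the interface names are hypothetical and are not a Junos data format. It shows only the comparison logic, not the Junos hook that would run it.

```python
def interfaces_missing_mpls(config, core_facing):
    """Return the core-facing interfaces that are not enabled for MPLS.

    `config` is a hypothetical nested dict built from the device
    configuration; `core_facing` lists the interfaces we expect to
    carry MPLS traffic.
    """
    mpls = config.get("protocols", {}).get("mpls", {})
    enabled = set(mpls.get("interfaces", []))
    return [name for name in core_facing if name not in enabled]


candidate = {
    "protocols": {
        "mpls": {"interfaces": ["ge-0/0/0.0", "ge-0/0/1.0"]},
    },
}
missing = interfaces_missing_mpls(
    candidate, ["ge-0/0/0.0", "ge-0/0/1.0", "ge-0/0/2.0"]
)
for name in missing:
    print(f"error: {name} connects to the core but is not enabled for MPLS")
```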
We all probably have complicated tasks that we only need to perform once in a while. It is not uncommon for me to answer a colleague’s question by scratching my head and saying, “I remember looking up the command to do that. Now, just give me a minute to find it.”
It is certainly the case that not every command should be reduced to automation. But things that are particularly critical and are typically needed in time-sensitive situations are good candidates for automation. Likewise, things that are particularly complicated are also candidates for automation.
A number of common network management tasks can be reduced to following a template with variables that need to be completed. Perhaps the paradigmatic example of this is network provisioning (although the same concept can apply in other contexts, such as troubleshooting).
Network provisioning often involves templates: a template for the base device configuration, a template for internal connections, a template for transit or peer connections, and multiple templates for different types of customer connections. These templates usually have very few variables that an operator needs to change in order to use them. However, it is easy for someone to accidentally omit an important line, forget to change one of the variables, or change one of the variables to an invalid value. And even if the user makes no mistakes while creating the configuration from the template, sometimes an error is introduced during the process of copying and pasting a large configuration block or a large set of commands.
We’ll talk more about provisioning templates when we talk about commit scripts in Chapter 5. Commit scripts are one way to easily apply a template. Commit scripts even provide an optional way to reduce the size of the Junos configuration file by having Junos display the template parameters (rather than their full expansion) by default.
In this instance, automation can both save time and reduce the number of unintended errors. If the provisioning process is reduced to a simple script that asks for values for the few variables that are in a template, the script can make sure that the entire configuration is applied. The script can even ask the questions necessary to choose the correct template. In addition, the script can perform checks to validate that the values provided by the user make sense.
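As a rough sketch of what such a provisioning script might look like, the following Python uses only the standard library. The template text, the function name, and the choice of variables are assumptions made for this example; a real template would be longer and the checks more thorough.

```python
import ipaddress
from string import Template

# Hypothetical fill-in-the-blanks template for a customer-facing interface.
CUSTOMER_TEMPLATE = Template(
    'set interfaces $interface description "$customer"\n'
    "set interfaces $interface unit 0 family inet address $address\n"
)


def render_customer_config(interface, customer, address):
    """Validate the variables, then return the complete configuration."""
    # ip_interface() raises ValueError if the address is not valid.
    ipaddress.ip_interface(address)
    if not customer.strip():
        raise ValueError("customer name must not be empty")
    # substitute() raises KeyError if a template variable is left unfilled,
    # so a partially completed template is never returned.
    return CUSTOMER_TEMPLATE.substitute(
        interface=interface, customer=customer, address=address
    )


print(render_customer_config("ge-0/0/1", "Example Corp", "192.168.1.1/24"))
```

Because the script always renders the whole template, a line cannot be omitted by accident, and an invalid value is rejected before anything reaches the device.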
In fact, it is possible to even further reduce the information that a user needs to supply by having the script gather the information in the first place. Let’s assume that the customer data is all maintained in a SQL database. When prompted by the user, the script can query the database to gather the correct values for all the variables and present them to the user for validation. This further reduces the possibility of inadvertent errors.
From there, it is only a small step to fully automating the process. When a new customer is added to the database, an automated process can automatically activate the changes in the network.
It will probably not be possible to completely eliminate the possibility of someone entering incorrect information at some point in the process. However, using automation can reduce the number of places where incorrect information can be introduced. And it can promote consistency among the different systems that maintain information about the network by ensuring that the same information is used everywhere (whether that information is correct or incorrect).
Software-defined networking (SDN) can be a good example of an external service enabled by automation. Imagine that a network operator wants to optimize their traffic flow every five minutes based on a complex set of algorithms, but they want to conduct the recalculation faster than that if certain events occur. This is the kind of complex network service that requires automation to effectively implement it.
An example of an internal service enabled by automation is an automated troubleshooting service. Imagine that a network operator implements an automated troubleshooting service that monitors for network events that may indicate a network error. The automation tool then responds to those events according to a set troubleshooting template. If it finds a problem that it can automatically correct, it does so. Otherwise, it sends the output of its findings to the network operator’s trouble ticket system. That output should have all the information necessary to continue troubleshooting the problem. This has the potential to reduce the amount of time that network engineers spend gathering basic information and trying “simple” fixes. It also has the potential to suppress false errors, reducing the amount of time that network engineers spend responding to false alerts and allowing them to start working on the real problems more quickly. Finally, it has the potential to reduce the time to resolution for network problems.
It is helpful to have a good understanding of the way “management data” usually flows through the Junos system. Some examples of the types of management data that may flow through a system include configuration information, operational commands, and statistics. Having a good understanding of this management data flow will help you understand the distinctions between various methods of accessing this data.
As we look at the flow of management data, you will see that the management daemon (MGD) is a central hub of activity. Most management data flows through MGD. The Junos software accepts management connections through a variety of mechanisms; however, most eventually turn into Junoscript or NETCONF sessions that connect to MGD. MGD has three primary mechanisms for interacting with daemons: a management socket (which it uses to pass along operational command requests and responses), the shared configuration database, and Unix signals.
Additionally, a user can interact with the Junos software using the REST API. As described in “Internal Design”, these sessions are piped through some extra plumbing, but eventually reach MGD.
Finally, it is possible for a user to interact with the Junos software using PyEZ, or to have op, commit, or event scripts launch remote procedure calls (RPCs). In all of these cases, a Junoscript or NETCONF session is used to connect to the software. Again, these Junoscript or NETCONF sessions all terminate with MGD.
Figure 1-1 illustrates how these various connection mechanisms all eventually wind up as sessions connected to MGD.
Operational commands arrive at MGD over one of the connection methods just described. When MGD receives the operational commands, it can either satisfy them itself, pass them along to other daemons to satisfy the requests, or invoke other tools to satisfy the requests.
The get-authorization-information RPC (which is equivalent to the show cli authorization CLI command) is a prototypical example of the kind of request MGD fulfills itself. With this command, the user is asking for information about her authorization level. This is information that MGD maintains for each session, and it is easy for MGD to satisfy this request itself. (Another prototypical example in this category is the get-configuration RPC, or the show configuration CLI command. Again, this request is asking for information that MGD maintains internally.)
Two classic examples of the kinds of operational commands that are passed on to a daemon are the get-route-information RPC (equivalent to the show route CLI command) and the clear-bgp-neighbor RPC (equivalent to the clear bgp neighbor CLI command). In both cases, the routing protocol daemon (RPD) is the daemon that satisfies this request. It is the daemon that maintains information on the routing table (also known as the routing information base, or RIB); therefore, it is the daemon that can authoritatively answer a request for route information. Likewise, RPD is the daemon that maintains BGP neighbor relationships; therefore, it is the daemon that handles a request to reset one of those relationships. In these cases, MGD serves as a two-way pipe, passing the request from the user to RPD and passing the response from RPD to the user. Figure 1-2 illustrates this data flow. To support this communication, MGD maintains a management socket with most daemons. Operational requests and responses flow over this management socket.
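From the client's point of view, the routing decision MGD makes is invisible: the client simply names the RPC and MGD decides who answers it. The following Python sketch builds the XML for such a request with the standard library. The build_rpc helper is our own invention for illustration, the destination argument is used only as an example, and a real Junoscript or NETCONF session also needs session setup and framing that are not shown here.

```python
import xml.etree.ElementTree as ET


def build_rpc(rpc_name, **arguments):
    """Return an <rpc> request for the named RPC as a string.

    Keyword arguments become child elements; underscores in the
    argument names are converted to hyphens for the XML tags.
    """
    rpc = ET.Element("rpc")
    request = ET.SubElement(rpc, rpc_name)
    for name, value in arguments.items():
        child = ET.SubElement(request, name.replace("_", "-"))
        child.text = str(value)
    return ET.tostring(rpc, encoding="unicode")


# Both requests are answered by RPD, but the client does not need to know that.
print(build_rpc("get-route-information", destination="10.0.0.0/8"))
print(build_rpc("clear-bgp-neighbor"))
```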
In a few cases, MGD invokes tools to satisfy requests. One example is the Junos upgrade process, which requires more complex handling than a normal CLI command. MGD invokes an external tool to conduct part of the upgrade. Another example is op and commit scripts, which are actually run by the CSCRIPT utility. This process is all transparent to the user, and is just included here for the sake of completeness.
As Figure 1-2 illustrates, some requests must go all the way to the packet forwarding engine (PFE) to be satisfied. Interface statistics are an example of this type of request. To gather interface statistics, MGD invokes ifinfo, which queries the kernel. The kernel often needs to query a PFE to obtain these statistics. In this way, operational commands sometimes have important impacts on the system.
When you commit configuration changes, the new configuration data is placed into a shared database that all the daemons can access. When this configuration database changes, MGD uses Unix signals to signal the appropriate daemons to reread the new configuration. After reading the configuration, the daemons activate its contents. If the new configuration requires changes in the forwarding plane, this data will be propagated to the PFEs. This process is illustrated in Figure 1-3.
The rest of this book assumes some knowledge of the configuration databases and the commit model. Although this is basic information, it is important that you review these concepts to understand portions of the book where it is referenced.
Every Junos system has at least two configuration databases: the committed configuration and the candidate configuration. As the names imply, the committed configuration is the configuration that is currently active, while the candidate configuration is the copy of the configuration a user is in the process of editing. As illustrated in Figure 1-5, when a user commits the configuration, the candidate configuration becomes the committed configuration and a copy of the new committed configuration becomes the candidate configuration. The Junos software saves a copy of each previous committed configuration in case you need to reference these in the future. (You can reference these saved configurations as the
Note the specific language here: the new candidate configuration is a copy of the new committed configuration. In some cases (notably, when a commit script modifies the configuration as part of a commit), the new committed configuration may not match the candidate configuration at the time the user triggers the commit.
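A toy model may help make this distinction concrete. The Python class below is purely illustrative: the class and attribute names are invented, configurations are plain dictionaries, and the optional commit_script callable stands in for whatever changes a commit script might make during the commit.

```python
import copy


class ConfigDatabases:
    """Toy model of the committed, candidate, and saved configurations."""

    def __init__(self, initial):
        self.committed = copy.deepcopy(initial)
        self.candidate = copy.deepcopy(initial)
        self.saved = []  # previous committed configurations, newest first

    def commit(self, commit_script=None):
        new_committed = copy.deepcopy(self.candidate)
        if commit_script is not None:
            # A commit script may modify the configuration during the commit.
            new_committed = commit_script(new_committed)
        self.saved.insert(0, self.committed)
        self.committed = new_committed
        # The new candidate is a copy of the new committed configuration,
        # not of the candidate as it stood when the commit was triggered.
        self.candidate = copy.deepcopy(new_committed)


def add_login_message(config):
    """Hypothetical commit script that adds a statement to the configuration."""
    config["login-message"] = "authorized use only"
    return config


db = ConfigDatabases({"host-name": "r0"})
db.candidate["host-name"] = "r1"
db.commit(add_login_message)
print(db.committed)
print(db.saved[0])
```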
In addition to editing the shared candidate configuration database, there are at least two other ways of editing the configuration: you can edit the shared candidate configuration database in exclusive mode, or you can edit a private candidate configuration.
When you enter configure exclusive, the software ensures that you will be the only one making changes to the shared candidate configuration database. In addition, it ensures that no one else has previously made uncommitted changes to the shared database. Finally, it will not let you save changes to the shared candidate configuration database unless you commit them.
This behavior makes the exclusive configuration mode quite useful for automated scripts. It ensures they will not interact with other configuration sessions in an unexpected way. Configuration sessions that use NETCONF can obtain equivalent behavior using the lock-configuration remote procedure call. NETCONF is covered in “Management System Internals” and RPCs are discussed in Chapter 2.
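If you attempt to enter the exclusive configuration mode while another user holds the exclusive lock, you will see an error:

    user1@r0> configure exclusive
    error: configuration database locked by:
      user2 terminal p0 (pid 38905) on since 2015-04-25 08:34:03 PDT, idle 00:04:20
          exclusive

    user1@r0>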
Likewise, you will see an error if you attempt to enter the exclusive configuration mode while there are uncommitted changes in the shared configuration database:
    user@r0> configure exclusive
    error: configuration database modified

    user@r0>
Finally, when you enter the exclusive configuration mode, the software will warn you that you cannot save uncommitted changes in the shared candidate configuration database when you exit. And if you do attempt to exit with uncommitted changes, it will ask you to confirm that you want to discard these changes:
    user@r0> configure exclusive
    warning: uncommitted changes will be discarded on exit
    Entering configuration mode

    user@r0# set system host-name r1

    user@r0# exit
    The configuration has been changed but not committed
    warning: Auto rollback on exiting 'configure exclusive'
    Discard uncommitted changes? [yes,no] (yes)
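The three behaviors just described (refusing entry while the database is locked, refusing entry while it holds uncommitted changes, and discarding uncommitted changes on exit) can be modeled with a small Python context manager. This is a toy model to illustrate the semantics only. All of the class names are invented, and nothing here talks to a device.

```python
class ConfigLockedError(Exception):
    """Another session already holds the exclusive lock."""


class ConfigModifiedError(Exception):
    """The shared candidate holds uncommitted changes."""


class SharedDatabase:
    """Toy shared candidate database with an optional exclusive lock."""

    def __init__(self, committed):
        self.committed = dict(committed)
        self.candidate = dict(committed)
        self.locked_by = None


class ExclusiveSession:
    def __init__(self, db, user):
        self.db = db
        self.user = user

    def __enter__(self):
        if self.db.locked_by is not None:
            raise ConfigLockedError(
                f"configuration database locked by: {self.db.locked_by}"
            )
        if self.db.candidate != self.db.committed:
            raise ConfigModifiedError("configuration database modified")
        self.db.locked_by = self.user
        return self

    def set(self, statement, value):
        self.db.candidate[statement] = value

    def commit(self):
        self.db.committed = dict(self.db.candidate)

    def __exit__(self, exc_type, exc, traceback):
        # Uncommitted changes are discarded when the session exits.
        self.db.candidate = dict(self.db.committed)
        self.db.locked_by = None
        return False


db = SharedDatabase({"host-name": "r0"})
with ExclusiveSession(db, "user") as session:
    session.set("host-name", "r1")  # never committed
print(db.candidate)  # the uncommitted change is gone
```

A script that uses this pattern either commits everything it changed or leaves no trace, which is the property that makes the exclusive mode attractive for automation.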
Another option for controlling unintended interactions between configuration sessions is to work on private copies of the candidate configuration database (Figure 1-8). This option has some of the same restrictions as working on an exclusive copy of the candidate configuration, except it allows simultaneous editing of the configuration. Once you commit your changes, the changes will be applied to the committed configuration. Similar to the way revision control systems (like SVN or CVS) work, the Junos software can merge nonconflicting changes from multiple simultaneous sessions; however, the software cannot merge conflicting changes.
When you enter configure private, the Junos software creates a new, private copy of the committed device configuration. Your changes go into this private copy of the configuration. The Junos software will ensure that no one else has previously made uncommitted changes to the shared candidate configuration database. (It is fine for others to be making changes to other private databases.) Finally, it will not let you save changes to the private candidate configuration database unless you commit them.
This behavior makes the private configuration mode useful for automation scripts. It ensures that an automated session will not commit changes from another configuration session. And, unlike with the exclusive configuration mode, multiple scripts can simultaneously work on making nonoverlapping changes using the private configuration mode.
As compared to the exclusive configuration mode, there are some different failure modes to consider. For example, the script will probably not be able to automatically resolve conflicts with its changes. And, even though multiple scripts can work on applying changes to their own configuration databases simultaneously, it requires a separate commit to activate the changes from each private configuration database, and the actual commits still happen serially.
This example provides a preview of the XML syntax Junos uses for RPCs. Additional information on RPCs and this XML syntax is provided in Chapter 2.
If another user commits a conflicting change before you commit your changes, you may get a warning when you commit your changes. Here, two users try to configure the same interfaces. The second user gets this output:
    user2@r0# set interfaces ge-0/0/0 description "description #2"

    user2@r0# set interfaces ge-0/0/1.0 family inet address 192.168.1.1/24

    user2@r0# commit
    [edit interfaces ge-0/0/0 description]
      'description "description #1"'
        warning: statement exists (discarding old value, replacing with 'description #2')
    [edit interfaces ge-0/0/1]
      'unit 0'
        warning: statement already exists
    [edit interfaces ge-0/0/1 unit 0 family]
      'inet'
        warning: statement already exists

    user2@r0#
The first warning shows that there is a conflict for an item that can only have one value. An interface can only have one description. Here, the current description is description #1. The warning indicates that the Junos software will replace that description with the new description (description #2) if user2 continues with the commit operation.

The “statement already exists” warnings show that there is a conflict for part of the configuration hierarchy that may be mergeable. These warnings indicate that the Junos software will try to merge user2’s changes to this configuration hierarchy with other changes that another user has already made to the same configuration hierarchy.
When you get a warning like this, the Junos software does not continue with the commit operation. (This is indicated by the lack of a commit complete message in the CLI, the <commit-success/> element in Junoscript output, or the <ok/> element in NETCONF output.) When you encounter this situation, you have two options to proceed further with the commit.
First, you can use the update command to merge the changes from the current committed configuration into your private database. The software does not, generally, overwrite your changes; rather, it computes the configuration that results from merging your changes into the current committed configuration and then installs this into your private database. (Perhaps it would be helpful to think of this as being analogous to a git rebase.) You can view the results of this merge by examining your private configuration database.
Once you have updated your private database, you can commit the changes without getting another warning (unless, of course, another user makes more conflicting changes between the time you update your database and the time you commit the changes).
The other thing you can do when you get a warning like this is to simply execute the commit command again. This action causes the Junos software to attempt to merge your changes into the committed configuration. In general, a second commit command is not recommended, as you may not accurately predict the final configuration that will result from these changes. However, if you fully consider the consequences, you may choose to use this option.
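The following Python sketch illustrates, in a deliberately simplified way, what it means for changes to conflict and what merging the committed configuration into a private database produces. It is not the algorithm the Junos software uses. Configurations are modeled as flat dictionaries keyed by statement path, deletions are ignored, and all of the function names are hypothetical.

```python
def session_changes(base, private):
    """Return the statements this session changed since it was opened."""
    return {k: v for k, v in private.items() if base.get(k) != v}


def find_conflicts(base, committed, private):
    """Return statements changed both by this session and by someone else."""
    mine = session_changes(base, private)
    return sorted(
        key
        for key, value in mine.items()
        if committed.get(key) != base.get(key) and committed.get(key) != value
    )


def merge_into_private(base, committed, private):
    """Rebuild the private database on top of the current committed one.

    The session's own changes are kept; everything else is taken from
    the current committed configuration.
    """
    merged = dict(committed)
    merged.update(session_changes(base, private))
    return merged


base = {}  # committed configuration when the private session was opened
committed = {"interfaces ge-0/0/0 description": "description #1"}
private = {
    "interfaces ge-0/0/0 description": "description #2",
    "interfaces ge-0/0/2 description": "uplink",
}
print(find_conflicts(base, committed, private))
print(merge_into_private(base, committed, private))
```

A script that works in the private configuration mode could run a comparison like find_conflicts before committing and hand the result to an operator instead of guessing how to resolve it.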
You can choose to mix the exclusive and private configuration modes. If you do this, you will be able to open a private configuration database while another user holds an exclusive lock on the configuration database. However, you will be unable to commit changes made to your private database while another user holds an exclusive lock on the configuration database. Instead, you will receive an error. You will need to wait until the user releases the exclusive lock, at which point you can proceed with your commit operation.
In this process, we describe a system with redundant routing engines (REs). In fact, we describe a system with more than two REs, as that system follows the most complicated commit process. To simplify this process for a system with fewer REs (even a single-RE system), simply omit the steps that apply to the other REs.
The steps are as follows:
1. The master RE runs its commit checks. This may include:
   - Checking for consistency in the data. (For example, if a BGP group refers to a policy, that policy should be defined.)
   - Running commit scripts. (Commit scripts can return warnings or errors that are caught at this stage.)
   - Running daemons in a special mode that conducts more in-depth analysis of the candidate configuration.
2. The master RE pushes the configuration to the other REs and asks those REs to conduct their commit checks.
3. The other REs activate the new configuration.
4. The master RE activates the new configuration.
If an error is detected at any step in the process, the commit process is aborted and the software returns to using the previous active configuration.
This process enforces a contract with the user: Junos will take the time it needs to thoroughly validate that the configuration is acceptable, and in return it will ensure that the Junos software components will be able to parse the configuration at the end of the process. In essence, you are sacrificing time in return for reliability.
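The ordering of these steps is what provides the guarantee: every RE checks the configuration before any RE activates it. The Python sketch below models only that ordering. The RoutingEngine class, the check callables, and the CommitError exception are invented for this illustration and do not correspond to any Junos interface.

```python
class CommitError(Exception):
    """Raised when a commit check fails; nothing has been activated yet."""


class RoutingEngine:
    def __init__(self, name, checks=()):
        self.name = name
        self.checks = list(checks)  # callables returning a list of errors
        self.active = None

    def check(self, config):
        errors = [err for chk in self.checks for err in chk(config)]
        if errors:
            raise CommitError(f"{self.name}: " + "; ".join(errors))

    def activate(self, config):
        self.active = dict(config)


def commit(master, others, candidate):
    master.check(candidate)          # step 1: master RE commit checks
    for engine in others:            # step 2: push to other REs and check
        engine.check(candidate)
    for engine in others:            # step 3: other REs activate
        engine.activate(candidate)
    master.activate(candidate)       # step 4: master RE activates


def policy_check(config):
    """Hypothetical consistency check: a referenced policy must be defined."""
    if config.get("export") and config["export"] not in config.get("policies", []):
        return [f"Policy {config['export']} referenced but not defined"]
    return []


re0 = RoutingEngine("re0", [policy_check])
re1 = RoutingEngine("re1", [policy_check])
commit(re0, [re1], {"export": "to_peers", "policies": ["to_peers"]})
print(re0.active == re1.active)
```

Because all checks finish before the first activation, a failed check leaves every RE on its previous active configuration.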
As you enter configuration statements, the management daemon validates that each statement is syntactically correct. (Here, “syntactic correctness” refers to each command being properly formed and constructing a valid configuration block.) If it notices a problem, it typically rejects the configuration statement.
Additionally, MGD conducts some semantic checks as you modify the configuration. (Here, “semantic correctness” refers to the state where the entire configuration is coherent and understandable.) If it notices a problem, it typically adds a comment to the configuration to warn you of the problem. In this example, the BGP configuration references an export policy, but the policy it references is not in the configuration:
    show protocols bgp export
    export does_not_exist; ## 'does_not_exist' is not defined
Once you commit the configuration, MGD will again conduct its semantic checks; however, these checks will result in either an error or a warning, as appropriate. For example, if you attempt to commit the configuration with the BGP export policy still undefined, you will see this output:
    commit
    error: Policy error: Policy does_not_exist referenced but not defined
    [edit protocols bgp]
      'export'
        BGP: export list not applied
    error: configuration check-out failed
If the MGD semantic checks pass, MGD applies any commit scripts that are listed in the candidate configuration. The commit scripts can return warnings or errors. Warnings are displayed to the user but do not impact the commit process. Errors are displayed to the user and abort the commit process. (There is more information about commit scripts in Chapter 5.)
If the commit scripts return no errors, MGD calls the various daemons interested in the changes and asks them to verify that the configuration is semantically correct. These daemons can return warnings or errors. Again, both are displayed to the user, but only errors abort the commit process.
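The difference between warnings and errors across these stages can be sketched as a simple pipeline. In this hypothetical Python model, each stage is a function that returns a pair of lists (warnings, errors); the stage names, the dictionary layout of the configuration, and the CommitAborted exception are all assumptions made for the example.

```python
class CommitAborted(Exception):
    """Raised when any check stage reports an error."""


def mgd_semantic_check(config):
    """Flag BGP export policies that are referenced but not defined."""
    defined = set(config.get("policy-options", {}))
    referenced = config.get("protocols", {}).get("bgp", {}).get("export", [])
    errors = [
        f"Policy {name} referenced but not defined"
        for name in referenced
        if name not in defined
    ]
    return [], errors


def example_commit_script(config):
    """Hypothetical commit script that only ever returns a warning."""
    return ["interface ge-0/0/0 has no description"], []


def run_commit_checks(config, stages):
    """Run the stages in order; warnings accumulate, errors abort."""
    shown = []
    for stage in stages:
        warnings, errors = stage(config)
        shown.extend(warnings)
        if errors:
            raise CommitAborted("; ".join(errors))
    return shown


config = {
    "policy-options": {"to_peers": {}},
    "protocols": {"bgp": {"export": ["to_peers"]}},
}
print(run_commit_checks(config, [mgd_semantic_check, example_commit_script]))
```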
When the RE activates the new configuration, the configuration is merged with other data (such as the platform defaults and transient changes from commit scripts). The new configuration then atomically becomes the new “active” configuration.
You can view the platform defaults with this command:

    show configuration groups junos-defaults
These configuration statements are applied like any other configuration group. That means that they can be overridden by user configuration. (Put differently, configuration groups are applied “behind” the user-supplied configuration, which means that user-supplied configuration can obscure conflicting portions of the default configuration.) This is a fancy and long-winded way of saying that the default configuration statements behave exactly the way you would expect default configuration statements to behave.
Once the configuration is committed, MGD makes any changes it needs to make (such as adding or deleting user accounts, or other changes that MGD is responsible for making) and signals other daemons to read the new configuration and activate the configuration changes each is responsible for. Because the Junos software tracks the changes a user has made, it only needs to signal the daemons that are interested in the parts of the configuration that have changed. Therefore, depending on the exact change, MGD may signal more or fewer daemons to reread the configuration. This means the device may do more or less work for each configuration change, depending on the content of the change.
This behavior becomes more relevant as the size of the configuration grows. For example, it takes much less time for the routing protocol daemon (RPD) to read a configuration with only a single routing instance and 10 static routes than it does for RPD to read a configuration with 1,000 routing instances and 400,000 static routes. Knowing the nature of your commit performance can help you develop smart strategies for handling configuration commits.
You can see this behavior by adding the | display detail directive to the end of the commit command. This causes the CLI to display the details about the activities MGD undertakes in order to activate the configuration. Among other details, you should see which daemons MGD signals to reread the configuration.
In rare circumstances you may encounter a bug in the logic that may cause MGD to miss signaling a daemon. (Juniper doesn’t like to see those bugs, but they can occur occasionally.) In those cases, you can use the commit full command to cause MGD to take all the actions to commit the entire configuration without regard to what has changed. If you do this, MGD acts as if the entire configuration has changed and all daemons are signaled to reread their configuration.
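The selection logic can be pictured with a short Python sketch. The mapping below is invented for illustration: it is not the real list of Junos daemons or of the configuration hierarchies each one cares about, and the full flag simply models the case where MGD treats the entire configuration as changed.

```python
# Hypothetical mapping from top-level configuration hierarchy to the
# daemons interested in it. The real mapping is internal to Junos.
INTERESTED_DAEMONS = {
    "protocols": {"rpd"},
    "routing-options": {"rpd"},
    "policy-options": {"rpd"},
    "interfaces": {"dcd", "rpd"},
    "snmp": {"snmpd"},
}


def daemons_to_signal(changed_paths, full=False):
    """Return the sorted list of daemons that must reread the configuration."""
    if full:
        # Act as if everything changed: every daemon is signaled.
        return sorted(set().union(*INTERESTED_DAEMONS.values()))
    daemons = set()
    for path in changed_paths:
        words = path.split()
        if words:
            daemons |= INTERESTED_DAEMONS.get(words[0], set())
    return sorted(daemons)


print(daemons_to_signal(["snmp community public"]))
print(daemons_to_signal(["interfaces ge-0/0/0 description uplink"]))
print(daemons_to_signal([], full=True))
```

A change confined to one hierarchy wakes up only the daemons that care about it, which is why the cost of a commit depends on the content of the change.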
Figure 1-9 gives a fuller expansion of the way configuration data, including transient commit script changes and platform defaults, is combined into a “merged view” that the daemons can use to activate the new configuration.
In general, configuration data is applied with this precedence:
1. Transient changes from commit scripts
2. The committed configuration (note that this includes any permanent changes made by a commit script)
3. Configuration applied from configuration groups (which includes platform defaults)
Chapter 5 contains more information on commit scripts, including the difference between transient and permanent configuration changes. For now, it is sufficient to understand that commit scripts can produce different kinds of changes, which will be applied with different precedence.
Actually, to be a little more precise, transient changes from commit scripts, the committed configuration, and configuration groups in the static configuration are merged together. After the data from these various sources is merged together, this “merged” view of the configuration (you can call it the “post-inheritance” configuration) is what the various daemons read when they activate a new configuration.
Configuration groups are only applied to configuration in the static configuration database. They are not applied to transient changes from commit scripts.
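As a rough illustration of this layering, the Python sketch below merges three nested dictionaries in precedence order. The dictionary representation, the function names, and the sample statements are assumptions made for the example; the real inheritance logic in Junos is considerably richer.

```python
def deep_merge(low, high):
    """Return a new dict in which values from `high` override `low`."""
    merged = dict(low)
    for key, value in high.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merged_view(groups, committed, transient):
    """Build the post-inheritance view that the daemons would read."""
    # Configuration groups sit "behind" the committed configuration, so
    # committed statements obscure conflicting group statements.
    static = deep_merge(groups, committed)
    # Transient changes are layered on top of the static configuration;
    # the groups were already applied and are not applied to them again.
    return deep_merge(static, transient)


groups = {"system": {"host-name": "default", "ports": {"console": "vt100"}}}
committed = {"system": {"host-name": "r0"}}
transient = {"interfaces": {"ge-0/0/0": {"mtu": 9000}}}
print(merged_view(groups, committed, transient))
```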
Network automation occurs at the interesting intersection of programming and network engineering. There certainly are good programmers who are also excellent engineers (or vice versa), but we need to be honest with ourselves and realize that those few people can probably fit in a fairly small room.
What is much more common is that a network engineer decides to do some programming, or a programmer is tasked with applying his skills to network automation. This happens for various reasons. Perhaps a company asks a network engineer to deploy new services that require automation, such as SDN. Or perhaps a network engineer simply decides that she wants to automate her tasks to save time. Or a company may hire a programmer to implement network automation. However it happens, it is easy to get stuck outside of your comfort zone if you are asked to combine skills you already have with new skills to do something that seems really complicated, like network automation.
Our goal is to help make learning these new skills simpler for you. If you are very familiar with operating Junos devices, we will give you the information you need to use those skills in network automation. Network automation does not need to be complicated. In fact, using the REST API (which we describe in Chapter 3), you can be doing basic network automation in no time. For other topics that are more advanced, you may need to consult outside references to understand the programming languages in use. In this book, we’ll use Python a lot. If you are not comfortable with Python, you might want to consult another book that covers Python, such as Learning Python by Mark Lutz (O’Reilly).
If you are very familiar with programming, we will help you apply your skills to Junos. We describe the methods that you can use to automate with Junos, as well as some of the tools, libraries, and protocols that support these methods. However, we will not be covering Junos fundamentals in detail. To make sure you have some required background information, we covered some of the fundamentals in this chapter. For additional information about Junos, you might want to consult one of the other references available to you. For example, Day One: Junos Tips, Techniques, and Templates 2011, edited by Jonathan Looney et al. (Juniper Networks Books), has information that may be helpful to you as you seek to become familiar with the Junos software.
One piece of advice is to not sacrifice the good for the perfect. There are many choices for automation, and there are many ways that a network can benefit from automation. We encourage you to pick something and start using it to make things easier for you. If you are new to programming, you may find that the REST API provides an easy way to begin automation using not much more than a shell script. If you are familiar with programming but new to the Junos software, you may find that the PyEZ library is the easiest entry point.
Another piece of advice is to keep an open mind. There are a number of tools. This provides you with the flexibility you need to choose the right tool for the right task. It is easy to get tunnel vision and try to solve everything the same way. However, there usually are no “one size fits all” answers when it comes to automation. A commit script might be the right answer for one problem, while a PyEZ script might be the correct solution for a different problem. Therefore, it is best to learn about all the tools and apply them where they are best suited.
Along the same lines, it is worth remembering that tools can be combined. For example, you might use PyEZ to deploy some configuration and use a commit script to expand that configuration. In this way, the two tools can work together to produce the configuration you want.
As you read the rest of the book, we hope you will find it useful to learn about the ways you can save time, improve efficiency, increase accuracy, and make your life easier by automating your network using the Junos software.