Network Lifecycle Management with Hierarchical Configuration

In a previous post, I hinted at a network configuration lifecycle management library called hierarchical_configuration. I’ve been meaning to write about it for a while, but we’ve been super busy at work. I also wanted to make sure our latest version of the library was publicly available before I wrote about it.

As your fleet of routers and switches grows, it becomes pretty natural to place these devices into a set of categories, for example core, aggregation, and access. Each of these categories typically has a standard configuration. Hopefully each of these standard configurations exists as a template, so that new deployments can be rolled out quickly. But what about making changes to the templates? Do you change the templates, then continue to roll them out to new deployments, leaving the existing install base with an outdated configuration? Or do you return to the install base and remediate the devices with updated configurations? What if you have thousands of devices? This is the problem that my colleagues and I set out to solve, and it is how hierarchical_configuration evolved.

So, what is hierarchical_configuration? hierarchical_configuration is a Python library that allows you to compare the running configuration of a network device with its intended configuration, then generate a set of commands that will bring the device into compliance with the intended configuration. hierarchical_configuration also has an extensive configuration file, so that you can define how specific commands or sections of commands get remediated.
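To make the idea concrete, here is a minimal sketch of that kind of comparison in plain Python. It is not the library's code or API: the `parse` and `remediation` helpers, the device name, and the sample configs are all hypothetical, and it only handles one level of indentation.

```python
def parse(config_text):
    """Map each unindented line to the list of its indented children."""
    sections = {}
    parent = None
    for raw in config_text.splitlines():
        if not raw.strip():
            continue
        if raw[0].isspace():
            if parent is not None:
                sections[parent].append(raw.strip())
        else:
            parent = raw.strip()
            sections.setdefault(parent, [])
    return sections


def remediation(running_text, intended_text):
    """Return commands that move the running config toward the intended one."""
    running, intended = parse(running_text), parse(intended_text)
    commands = []
    # Sections that exist on the device but are not intended get negated.
    for parent in running:
        if parent not in intended:
            commands.append("no " + parent)
    # Sections that are intended get their missing or extra children fixed.
    for parent, children in intended.items():
        have = running.get(parent, [])
        remove = [c for c in have if c not in children]
        add = [c for c in children if c not in have]
        if parent not in running or remove or add:
            commands.append(parent)
            commands.extend("  no " + c for c in remove)
            commands.extend("  " + c for c in add)
    return commands


# Hypothetical sample configs, for illustration only.
RUNNING = "\n".join([
    "hostname r1",
    "interface Ethernet0/1",
    " ip address 10.0.0.1 255.255.255.0",
    " mtu 9000",
    "ip access-list extended OLD",
    " permit ip any any",
])
INTENDED = "\n".join([
    "hostname r1",
    "interface Ethernet0/1",
    " ip address 10.0.0.1 255.255.255.0",
    " description uplink",
])

for line in remediation(RUNNING, INTENDED):
    print(line)
```

Notice that this naive version negates everything it removes. The options described in the rest of this post exist to refine exactly that behavior.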

Most utilities that perform a similar function to hierarchical_configuration apply command remediation by negating a command, then applying the new command. For instance, if you wanted to change the description of an interface, most utilities will do something like:
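A minimal sketch of that negate-then-apply behavior, written as a hypothetical helper rather than code from any particular utility (the interface name and descriptions are made up):

```python
def negate_then_apply(parent, old_command, new_command):
    """Naive remediation: remove the old command, then add the new one."""
    return [parent, "  no " + old_command, "  " + new_command]


plan = negate_then_apply(
    "interface Ethernet0/1",
    "description old uplink",
    "description new uplink",
)
for line in plan:
    print(line)
```

Two commands are sent to the device where a single `description new uplink` would have been enough.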

That works, but it wastes CPU cycles, which slows down the overall application run time when you are attempting to apply interface descriptions to thousands of interfaces. And what if negating the command could be impactful? Maybe something like changing ‘transport input ssh telnet’ to ‘transport input ssh’ under your line vty? Negating the command could potentially cause you to lose management access.

hierarchical_configuration gives you several configuration options for dealing with such scenarios. You define those in a YAML file under hier_options. Here is a sample of hier_options:

 

Let’s break down the individual sections of hier_options. The first section is ‘sectional_overwrite’. sectional_overwrite does exactly what it sounds like: it overwrites an entire section of configuration if there is a change. In the example, it tags ipv6 access-lists as a section of configuration that should use sectional_overwrite. If any changes are made to the intended configuration for an ipv6 access-list, then hierarchical_configuration overwrites the entire section of configuration, rather than targeting individual child lines in the section.
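Here is a rough sketch of the difference between overwriting a section and remediating it line by line. The helper and the ACL contents are hypothetical, and the choice to negate the whole section before rebuilding it is my assumption about what "overwrite" means in practice, not something taken from the library.

```python
# Hypothetical rule: sections starting with these prefixes are overwritten whole.
OVERWRITE_PREFIXES = ["ipv6 access-list"]


def remediate_section(parent, running_children, intended_children):
    """Return the commands needed to fix one section of configuration."""
    if running_children == intended_children:
        return []
    if any(parent.startswith(prefix) for prefix in OVERWRITE_PREFIXES):
        # Overwrite: drop the section and rebuild it from the intended config.
        return ["no " + parent, parent] + ["  " + c for c in intended_children]
    # Default: only touch the individual lines that differ.
    removed = [c for c in running_children if c not in intended_children]
    added = [c for c in intended_children if c not in running_children]
    return [parent] + ["  no " + c for c in removed] + ["  " + c for c in added]


acl_plan = remediate_section(
    "ipv6 access-list V6-IN",
    ["permit tcp any any eq 22", "deny ipv6 any any"],
    ["permit tcp any any eq 22", "permit tcp any any eq 443", "deny ipv6 any any"],
)
for line in acl_plan:
    print(line)
```

For an access-list, where the order of entries matters, rebuilding the whole section avoids ending up with the new entry appended after the final deny.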

The next section is ‘ordering’. Ordering is a very handy configuration option. It allows you to weight the order in which commands are presented in hierarchical_configuration. The default weight is 500. Commands with smaller weights are presented higher up in the configuration, while commands with larger weights are presented lower.

For instance, assume that you have an access-list called TEST, which is applied to Ethernet0/1:

Let’s say that you want to create a new access-list called TESTING and apply it to Ethernet0/1, rendering the access-list TEST unneeded. When you go to apply the configuration, you don’t want to remove the access-list TEST before you’ve created access-list TESTING and applied it to interface Ethernet0/1. Doing so may impact traffic that is flowing across the interface. The preferable order of operations is:

  1. Create the new access-list
  2. Apply the new access-list to the interface
  3. Remove the old access-list

To do so, you will want to move the command ‘no ip access-list’ closer to the bottom of the list of commands. You would do this by setting the order of the negation of ip access-list to a value higher than 500.

In this example, any generated command that starts with ‘no ip access-list’ gets tagged with an order of 525, which moves that section of configuration lower in the generated config. The generated configuration would look like:
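As a rough illustration of the weighting described above, the sketch below sorts remediation sections by weight. The rule format, helper names, and ACL contents are hypothetical; only the default of 500 and the weight of 525 for ‘no ip access-list’ come from this post.

```python
DEFAULT_WEIGHT = 500

# Hypothetical rule list: (command prefix, weight).
ORDERING_RULES = [("no ip access-list", 525)]


def weight(command):
    """Return the weight of a top-level command, or the default of 500."""
    for prefix, order in ORDERING_RULES:
        if command.startswith(prefix):
            return order
    return DEFAULT_WEIGHT


def order_sections(sections):
    """Sort sections by the weight of their first line.

    sorted() is stable, so sections with equal weight keep their relative order.
    """
    return sorted(sections, key=lambda section: weight(section[0]))


SECTIONS = [
    ["no ip access-list extended TEST"],
    ["ip access-list extended TESTING", "  permit ip any any"],
    ["interface Ethernet0/1", "  ip access-group TESTING in"],
]

for section in order_sections(SECTIONS):
    for line in section:
        print(line)
```

The removal of TEST now lands last, after TESTING has been created and applied to the interface.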

The next two sections are ‘full_text_sub’ and ‘per_line_sub’. When you pull a running config from a device, it will typically contain some fluff, such as:

That kind of text is just background noise when we are attempting to determine the difference between the running config and the intended config. per_line_sub resolves that by removing such lines before the configurations are compared.

per_line_sub finds any line that contains ‘Building configuration’ and replaces it with nothing, effectively deleting the line. full_text_sub performs a similar task, but for entire sections of configuration. In our example, we ignore the banners on the device, as those can contain information that is unique to the device.
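A minimal sketch of both kinds of substitution using Python's `re` module. The patterns, the `clean` helper, and the sample output are my own illustrations of the idea, not the patterns shipped with the library.

```python
import re

# Hypothetical per-line patterns: each is applied to one line at a time.
PER_LINE_SUB = [
    (r"^Building configuration.*", ""),
    (r"^Current configuration.*", ""),
]

# Hypothetical full-text pattern: matches a whole banner, from the
# "banner <type> <delimiter>" line through the closing delimiter line.
FULL_TEXT_SUB = [
    (r"^banner \S+ (\S)\n(?:.*\n)*?\1\n", ""),
]


def clean(config_text):
    """Strip noise from a running config before it is compared."""
    for pattern, replacement in FULL_TEXT_SUB:
        config_text = re.sub(pattern, replacement, config_text, flags=re.MULTILINE)
    lines = []
    for line in config_text.splitlines():
        for pattern, replacement in PER_LINE_SUB:
            line = re.sub(pattern, replacement, line)
        if line.strip():  # lines reduced to nothing are dropped
            lines.append(line)
    return "\n".join(lines)


RAW = "\n".join([
    "Building configuration...",
    "Current configuration : 1234 bytes",
    "hostname r1",
    "banner motd #",
    "Authorized access only",
    "#",
    "interface Ethernet0/1",
    " description uplink",
])

print(clean(RAW))
```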

hierarchical_configuration understands sections of config. It does this by treating any line of configuration without indentation as a parent and any indented lines beneath it as children of that parent. When the config reaches another line without indentation, the section ends. An example would be an interface configuration.

In some cases, there will be multiple tiers of parent / child configurations. An example would be peer templates in BGP.
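The parent and child relationship, including multiple tiers, can be sketched with a small indentation-based parser. This is an illustration of the idea only: `parse_tree` is a hypothetical helper, not the library's parser, and the BGP snippet is made up.

```python
def parse_tree(config_text):
    """Return a nested list of (line, children) tuples based on indentation."""
    root = []
    # Stack of (indent, children_list); the root sits at indent -1.
    stack = [(-1, root)]
    for raw in config_text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip())
        # Pop back to the nearest ancestor that is less indented than this line.
        while stack[-1][0] >= indent:
            stack.pop()
        node = (raw.strip(), [])
        stack[-1][1].append(node)
        stack.append((indent, node[1]))
    return root


CONFIG = "\n".join([
    "router bgp 65000",
    " template peer-policy EXAMPLE",
    "  send-community",
    " exit-peer-policy",
    " neighbor 192.0.2.1 remote-as 65001",
    "interface Ethernet0/1",
    " description uplink",
])

tree = parse_tree(CONFIG)
for line, children in tree:
    print(line, "->", [child for child, _ in children])
```

Here ‘send-community’ is a child of the peer-policy template, which is itself a child of ‘router bgp 65000’.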

With ‘sectional_exiting’, you can define sections of configuration that have sub-children of children, as explained above.

‘idempotent_commands’ is the section where you define what should be overwritten, rather than negated and then re-applied with the new configuration. Commands such as hostname, description, ip address, etc. should all be overwritten rather than negated.
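A small sketch of how such a rule changes the generated commands. The prefix list and the `replace_command` helper are hypothetical; the ntp example just stands in for any command that is not on the list.

```python
# Hypothetical list of command prefixes that can simply be re-applied.
IDEMPOTENT_PREFIXES = ["hostname", "description", "ip address", "transport input"]


def replace_command(old, new):
    """Return the commands needed to change `old` into `new`."""
    for prefix in IDEMPOTENT_PREFIXES:
        if old.startswith(prefix) and new.startswith(prefix):
            # The new command overwrites the old one, so no negation is needed.
            return [new]
    return ["no " + old, new]


print(replace_command("description old uplink", "description new uplink"))
print(replace_command("transport input ssh telnet", "transport input ssh"))
print(replace_command("ntp server 192.0.2.1", "ntp server 192.0.2.2"))
```

With ‘transport input’ on the list, the vty change from earlier becomes a single command and management access is never negated along the way.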

Another very handy set of options is command tagging, which resides in a different config section from hier_options, called hier_tags. Being able to tag commands allows you to generate remediation commands which target very specific changes, such as creating a new access-list, applying it, then removing the old access-list. We’ll continue to use the example from above of replacing the access-list TEST with TESTING. The first thing we need to do is set up our tagging, which will look like:

Assume that your running config is:

and your intended config is:

We can use hierarchical_configuration to compare the two configurations, apply the appropriate tag (NEW_ACL), then spit out a configuration plan based on the NEW_ACL tag.

Here is a sample output of the configuration comparison:

As you can see, only the parts of the config pertaining to the TESTING access-list are tagged with the NEW_ACL tag. We can now generate a config plan based on that tag.
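The tagging and filtering step can be sketched in plain Python as follows. The rule format, the lineage tuples, and both helpers are hypothetical stand-ins for illustration; they are not the hier_tags schema or the library's API.

```python
# Hypothetical tag rules: (lineage prefixes from parent to child, tag).
TAG_RULES = [
    (("ip access-list extended TESTING",), "NEW_ACL"),
    (("interface Ethernet0/1", "ip access-group TESTING"), "NEW_ACL"),
]

# Hypothetical remediation, one tuple per line, written as its full lineage.
REMEDIATION = [
    ("ip access-list extended TESTING",),
    ("ip access-list extended TESTING", "permit ip any any"),
    ("interface Ethernet0/1",),
    ("interface Ethernet0/1", "ip access-group TESTING in"),
    ("no ip access-list extended TEST",),
]


def tags_for(lineage):
    """Return the set of tags whose rule matches the start of this lineage."""
    tags = set()
    for rule, tag in TAG_RULES:
        if len(lineage) >= len(rule) and all(
            line.startswith(prefix) for prefix, line in zip(rule, lineage)
        ):
            tags.add(tag)
    return tags


def config_plan(remediation, tag):
    """Render only the lines carrying `tag`, plus the parents they live under."""
    plan, emitted = [], set()
    for lineage in remediation:
        if tag not in tags_for(lineage):
            continue
        for depth in range(len(lineage)):
            path = lineage[: depth + 1]
            if path not in emitted:
                emitted.add(path)
                plan.append("  " * depth + path[-1])
    return plan


for line in config_plan(REMEDIATION, "NEW_ACL"):
    print(line)
```

The removal of the old TEST access-list carries no NEW_ACL tag, so it stays out of this plan and can be pushed separately once the new access-list is in place.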

Here is the script that produced the output:

Let’s break down the script. The very first thing we do is import yaml and hierarchical_configuration:

Next, we read the hierarchical_configuration config file and define the options and tags variables:

Now, we define an instance of hierarchical_configuration for the running config and load the running config from a file:

Then, we do the same for the intended config:

Once that is done, we can perform the comparison:

Now, we load the ordering, sectional exiting, and tags options:

At this point, everything is loaded into memory. The next two portions of code are simply to have a visual of what is happening. The first portion is a for-loop, which displays the raw list of dictionaries:

Finally, the finished product: generating a config plan based on the NEW_ACL tag.

As you can see, hierarchical_configuration is a very powerful lifecycle management tool for network gear. We’ve been using it successfully on IOS, IOS-XR, IOS-XE, NX-OS, and EOS devices. It has made our work less risky from an outage perspective and more consistent, and it allows us to automate and move faster than we have in previous years.

The code is available on GitHub here.


July 8, 2016

Posted In: Cisco Administration Python Scripting, DevOps, Network DevOps, Network Programmability, Python Tips