Welcome to Part 2 of my CI/CD pipeline series! In this post, I will be going over the improvements I’ve made since Part 1, some issues I ran into along the way, and the overall goal(s) of this project.
Let’s start with the project goals. One of my main goals is to help provide clarity with CI/CD pipelines, as it pertains to network automation, and document the wins/losses along the way. I think we can agree that experiencing NetDevOps and Infrastructure-as-Code (IaC) is a pretty refreshing experience. There are a lot of new tools and processes learned along the way. However, we need to understand why we would do something different and go through the experience ourselves before touting it. I hope to describe my experiences (good and bad) through this series, and also why you and your team would consider adopting these new workflows.
The more technical goal of this project is to build a tool that can used as a proof-of-concept (PoC) or demo for a network team that is first being introduced to Infrastructure-as-Code. With that being said, I will continue refactoring the logic and code as I learn more about CI/CD and Ansible. From this post forward, I will also begin providing more technical details of what’s being updated in the project.
So What’s New?
As mentioned in Part 1, I mostly followed a tutorial to build the initial pipeline. In Part 2, I expanded on that by adding the following:
- Ansible host and group vars
- Ansible roles
Along with the the additions, I also ran into some issues, but I’ll get into that later in the post. First, let’s jump into the details of what’s changed, specifically with Ansible.
Diving into Ansible
Before going into the specific changes, here’s a current snapshot of my repository:
I plan on reviewing and making my repo public very soon (stay tuned to my Twitter!), but I figure I would include my current repo structure and use it as reference throughout this post.
Host and Group Vars
One of the first improvements I made was to break out the variables for each host and group of hosts. This helped modularize my playbooks and allows for future scalability. Take the below host_vars file and related task as an example:
In the above task, a Loopback interface is configured. However, to make this task scalable, we use inherited variables from each device’s host_vars file. This is super important because now we have one playbook that can configure different Loopback interfaces with different IP addresses on every device in our inventory. For example, we could add another item to the ‘loopbacks’ list (i.e. lo1: 10.254.11.1) in the example host_vars file and just re-run the task to configure it. This becomes crucial when you have hundreds of devices with potentially different requirements based on design, geography, etc. With IaC, the host_vars file becomes a source of truth for what’s configured on the device (assuming no one goes rogue). You can take these same concepts and apply them to group_vars. For example, you can inherit the same SNMP server settings or syslog configuration from a specific group of devices in a playbook based on physical geography, type of device, device vendor, or any other arbitrary criteria. Now, let’s move on to roles!
This is my first time using the concept of roles in Ansible, so please do not take any of my instructions or code as best practice. I’m still in the beginning of my journey learning the advanced features of Ansible. As I was reading through the docs, I noticed that many of the examples showed different services being described as roles. For example, the Apache service could be considered part of a ‘webserver’ role. Since I’m just starting out, I figure I would make it as simple as possible and begin creating roles that revolved around pieces of a network configuration. In my example, I created the following roles: ‘device_mgmt’ and ‘eigrp’. The ‘device_mgmt’ role will consist of features around device management: Loopback interfaces, SNMP settings, NetFlow, etc. The ‘eigrp’ role is pretty self-explanatory.
Ultimately, I want to create roles for specific device types in a network (WAN routers, L3 switches, L2 switches, etc.) and shift my current roles (device_mgmt and eigrp) into dependencies of the device type roles. For now, I will be using my current roles in playbooks to configure specific device types. Here’s an example:
The above playbook is designed to configure a device (or group of devices) as a WAN router using the roles shown. As you can see, this approach is the easiest to digest and understand the different configuration items being applied (with the use of roles). I’m running this playbook against a generic group called ‘routers’, but you can imagine the different ways devices can be grouped in your inventory file (i.e. by physical geography, device vendor, role in network, etc.).
Other Misc. Changes/Issues
Besides applying new Ansible concepts, I did make some other small tweaks and changes related to the CI/CD pipeline. I had to build a new VM to use as a GitLab Runner. My existing devbox VM was having issues starting the GitLab Runner service on startup, which really held me up for a couple hours. Ultimately, I found that there was an open bug in GitLab for the service not starting on Ubuntu 20.04. As a result, I built a new VM using an Ubuntu 18.04 image, which took about 20 secs, since I’m running my entire infrastructure in GCP (perks of the cloud!).
Since I had to build a new runner, I decided to take a look at my .gitlab-ci.yml file and added a new ‘before_script’ job that activates a Python virtual environment before every job is ran. This helps eliminate any potential issues that could occur if I was using the system Python interpreter and also helps control the packages installed in the Python environment. I also added more linting tasks to lint all the YAML files in the host_vars and groups_vars directories.
The last change I made to the repo was add an Ansible configuration file locally to the repo (ansible.cfg). This allows Ansible to use the same configuration when running through the jobs in the CI pipeline.
This wraps up Part 2 of my CI/CD Pipelines series. In Part 3, we will look at adding additional configuration (i.e. NetFlow, Usernames, etc.), expanding on our existing roles, and adding some more controls to the CI/CD process, including requiring manual intervention before proceeding to the ‘deploy’ stage. Thanks for reading and stay tuned!