This week I’m starting a series of posts about my first experience creating a network automation CI/CD pipeline in GitLab (a.k.a. Infrastructure-as-Code). I’m not sure how many parts there will be in total, but I figured I would document my experience throughout the process.
Disclaimer: All credit goes to @NWMichl for his great series of Infrastructure-as-Code blog posts. I used them as a reference to set up my first pipeline (the one I’ll be discussing in this post). I do not intend for this series to be a how-to (there are already many great ones out there!). Instead, it will be about my experience building my first CI/CD pipeline and any additional tweaks I made during the process. First, let’s check out the list of tools and resources I used in my workflow.
- GitLab
  - Manage and deploy code using a CI/CD pipeline
- Ubuntu 20.04 (Devbox)
  - Installed with Ansible 2.9
- EVE-NG Professional
- Google Cloud Platform (GCP)
  - Hosting my Devbox and EVE-NG instances
Now that you know the list of tools and have a visual diagram, I’ll quickly run through the above workflow. First, all my code and CI/CD operations are managed in GitLab. If you haven’t used GitLab, it’s essentially an alternative to GitHub. The more interesting part of my topology resides in Google Cloud (GCP). I have two VM instances deployed: Devbox and EVE-NG Pro. The Devbox is an Ubuntu 20.04 instance that has Ansible 2.9 installed on it. I’ll be using this instance as my Ansible control node – the box that runs the Ansible playbooks. The EVE-NG Pro instance is pretty self-explanatory. It is a VM running EVE-NG Professional. For more information on how to deploy EVE-NG in GCP, check out EVE-NG’s official documentation here or Knox Hutchinson’s video here. I also found a blog post series by OpenEye Software that helped me out a lot. Here’s a link to that blog series.
So how do all these pieces work together? I added a few arrows in the diagram to show the actions that take place in a typical workflow, but let’s walk through them:
- A user would make a change to the code repository and create a merge request (GitLab’s term for a pull request) to merge their changes into the ‘master’ branch.
- The merge request would be reviewed and approved.
- Once approved, the changes are merged into the ‘master’ branch. This triggers the CI/CD pipeline, and the jobs outlined in the .gitlab-ci.yml file within the code repository are run.
- GitLab will run the pipeline jobs on a ‘runner’, which in my case is the Devbox in GCP. All job results will be shown in real-time in GitLab.
- GitLab will show the status of each job within the pipeline and whether the pipeline passed.
As a reminder, this CI/CD pipeline will automatically run every time a change is pushed to the ‘master’ branch. The trigger can be customized, but for simplicity, I kept the default settings.
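As a rough illustration, the workflow above can be modeled in a few lines of Python. This is a toy sketch, not how GitLab works internally. It assumes one job per stage, and the function names and job names (`lint`, `deploy`) are made up for the example.

```python
from typing import Callable, Dict, List, Tuple

# Each job is a (name, callable) pair; the callable returns True on success.
Job = Tuple[str, Callable[[], bool]]


def triggers_pipeline(pushed_branch: str, watched_branch: str = "master") -> bool:
    """Model the behavior described above: a change landing on 'master' starts a run."""
    return pushed_branch == watched_branch


def run_pipeline(jobs: List[Job]) -> Dict[str, str]:
    """Run jobs in order and stop running work after the first failure."""
    results: Dict[str, str] = {}
    failed = False
    for name, job in jobs:
        if failed:
            # Jobs after a failed one are not run, only reported.
            results[name] = "skipped"
            continue
        succeeded = job()
        results[name] = "passed" if succeeded else "failed"
        failed = not succeeded
    return results


def pipeline_passed(results: Dict[str, str]) -> bool:
    """The pipeline passes only if every job passed."""
    return all(status == "passed" for status in results.values())


# Hypothetical jobs standing in for the entries in .gitlab-ci.yml.
example_jobs: List[Job] = [
    ("lint", lambda: True),
    ("deploy", lambda: True),
]

if triggers_pipeline("master"):
    job_results = run_pipeline(example_jobs)
    print(job_results, pipeline_passed(job_results))
```

The point of the model is the ordering: a failed early job keeps the later ones (such as a deployment) from running, which is what makes linting before deploying worthwhile.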
Creating the CI/CD Pipeline
As mentioned at the beginning of this post, I followed @NWMichl’s blog series to create this CI/CD pipeline, so I will not go through the content verbatim since you can follow the series here. I will focus more on my experience while trying to replicate the workflow.
At a high level, I’m managing the configuration of a set of Cisco IOS routers in EVE-NG using Ansible. In my repo, I have an Ansible inventory file, group_vars, and one playbook. For those not familiar with group_vars, they are basically default values that should be applied to a group of devices. As it relates to networking, think about your network services: Syslog, NTP, SNMP, etc. The configuration for these services is normally the same across a large group of devices, so it makes sense to keep it in a group_vars file. In my workflow, I only define a couple of syslog servers, but I plan to expand in the future.
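To make the group_vars idea concrete, here is a minimal Python sketch of "one set of values applied to every device in a group". It is only a model. Ansible's real variable resolution has more precedence rules than this, and the group name, router names, and syslog addresses below are placeholders, not values from my repo.

```python
from typing import Dict, List

# Hypothetical inventory: one group containing three routers.
inventory: Dict[str, List[str]] = {"ios_routers": ["r1", "r2", "r3"]}

# Hypothetical group variables; the addresses are documentation-range placeholders.
group_vars: Dict[str, Dict[str, List[str]]] = {
    "ios_routers": {"syslog_servers": ["192.0.2.10", "192.0.2.11"]},
}


def vars_for_host(host: str) -> Dict[str, List[str]]:
    """Collect the variables of every group the host belongs to."""
    merged: Dict[str, List[str]] = {}
    for group, hosts in inventory.items():
        if host in hosts:
            merged.update(group_vars.get(group, {}))
    return merged


def syslog_config(host: str) -> List[str]:
    """Render one IOS 'logging host' line per syslog server defined for the host."""
    servers = vars_for_host(host).get("syslog_servers", [])
    return [f"logging host {server}" for server in servers]


for router in inventory["ios_routers"]:
    print(router, syslog_config(router))
```

Every router in the group ends up with the same syslog lines, and adding a server means editing one list instead of touching each device.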
Setting up an EVE-NG environment in GCP is pretty straightforward, but the difficulty came when I tried using a separate VM instance (Devbox) within the VPC to control the Cisco IOS routers in EVE-NG. Luckily, a blog post from OpenEye Software helped me create a management network using the Cloud nodes in EVE-NG. Check out the post here. The only issue I ran into, which I later found to be clearly documented, was that I forgot to enable IP Forwarding when creating the EVE-NG instance in GCP. IP Forwarding is a GCP setting that toggles strict source/destination checking. When it is enabled, a VM instance can forward packets that are not destined to itself. Because the management network in EVE-NG is on a different subnet from GCP’s internal networks, IP Forwarding needs to be enabled on both the Devbox and EVE-NG VMs to allow communication. This seems like a small detail, but I had to rebuild my entire EVE-NG VM, since as far as I could tell this setting can only be set when first creating a VM in GCP. I find this to be a limitation in GCP, as AWS and Azure let you disable strict source/destination checking at the network interface level of a VM after deployment. Once I was able to establish communication between the Devbox and the IOS routers in EVE-NG, I turned to creating the .gitlab-ci.yml file in GitLab.
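To make the IP Forwarding behavior concrete, here is a simplified Python model of the source/destination check. It is a sketch under assumptions, not GCP's implementation. All three addresses are hypothetical, and it assumes a VPC route already sends traffic for the management subnet to the EVE-NG VM.

```python
import ipaddress


def packet_allowed(vm_ip: str, src: str, dst: str, can_ip_forward: bool) -> bool:
    """Simplified model of the source/destination check applied to a VM.

    Without forwarding, the VM may only handle packets that are sourced
    from its own address or destined to its own address.
    """
    if can_ip_forward:
        return True
    vm = ipaddress.ip_address(vm_ip)
    return vm in (ipaddress.ip_address(src), ipaddress.ip_address(dst))


# Hypothetical addresses: two VMs in the VPC and one router behind EVE-NG.
DEVBOX_IP = "10.128.0.2"
EVE_NG_IP = "10.128.0.3"
ROUTER_MGMT_IP = "192.168.100.11"

# SSH traffic from the Devbox to a router has to transit the EVE-NG VM.
for forwarding in (False, True):
    allowed = packet_allowed(EVE_NG_IP, DEVBOX_IP, ROUTER_MGMT_IP, forwarding)
    print(f"can_ip_forward={forwarding}: allowed={allowed}")
```

In this model the EVE-NG VM is neither the source nor the destination of the packet, so with strict checking the packet never reaches the router. That matches the symptom of the Devbox being unable to reach the IOS routers.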
The .gitlab-ci.yml file outlines the jobs and actions that should run within the CI/CD pipeline for the code repository. My pipeline jobs essentially perform three tasks: linting, testing, and deploying to production. I use yamllint and ansible-lint to check the syntax of my Ansible inventory, vars, and playbook files. These linting jobs failed at various points throughout my experience, mostly due to incorrect spacing. If I can warn anyone, please check your spacing in all your YAML files! Another warning: if you are using an IDE like VS Code, ensure it doesn’t automatically add spacing to the files you edit. By default, some IDEs add additional whitespace. Besides the GCP communication and YAML spacing issues, creating the Ansible playbook and .gitlab-ci.yml file was straightforward. Let’s move on to the last part: getting the ‘runner’ to run…
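Before moving on, here is a quick illustration of the kind of spacing problem mentioned above. This is a deliberately small stand-in, not yamllint itself, which checks far more. The function name and the sample text are invented for the example. It flags two common mistakes: trailing whitespace and tabs used for indentation.

```python
from typing import List, Tuple


def find_spacing_issues(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for common YAML spacing mistakes."""
    issues: List[Tuple[int, str]] = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Everything before the first non-whitespace character is indentation.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            issues.append((number, "tab used for indentation"))
        if line != line.rstrip():
            issues.append((number, "trailing whitespace"))
    return issues


# Hypothetical vars file: line 2 ends with a space, line 3 is indented with a tab.
sample = "syslog_servers:\n  - 192.0.2.10 \n\t- 192.0.2.11\n"
for number, problem in find_spacing_issues(sample):
    print(f"line {number}: {problem}")
```

Neither problem is visible in most editors by default, which is why turning on visible whitespace in the IDE can save a few failed lint jobs.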
Run, Runner, Run!
The last and, perhaps, the trickiest part of getting the CI/CD pipeline up and running was installing the GitLab Runner on the Devbox. You can install the Runner on any OS (Windows, macOS, Linux) or in a container (Docker, Kubernetes, OpenShift). Since the Runner must communicate with GitLab, you must ensure GitLab.com is reachable from the Runner, which you would think would cause the majority of issues. However, my issue came during the installation process.
I installed the Runner using ‘apt install’. After installation, I registered the Runner, which generated a config.toml file. This is the Runner’s configuration file, where you can specify settings such as the type of executor the Runner should use. In our case, the CI jobs run shell commands (e.g. ‘ansible-playbook play1.yml’), so the executor is configured as shell.
After registering the Runner, I received errors that paramiko was not installed when certain plays in my Ansible playbook ran. I turned on debugging and found that the job was being received and then failing immediately without a traceback. Since the job was received, I could conclude that communication between the Devbox and GitLab was working, so I began looking at the communication between the Devbox and the IOS routers in EVE-NG. I ran the Ansible playbook locally on the Devbox and it executed successfully. Now I was stumped… If paramiko was not installed, the playbook shouldn’t have been able to run successfully on the Devbox.
Ultimately, I created a Python virtual environment, reinstalled Ansible, and added a section to my Runner’s config.toml file that runs a set of shell commands before every job it receives. These commands activate the Python virtual environment and place the Runner in the directory containing the playbook. This isn’t the best or most scalable approach, but I was just trying to get it working at this point in my long day of troubleshooting.
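One possible explanation for "works locally, fails in the pipeline" is that the Runner's jobs use a different user or Python environment than an interactive shell on the same box. This is an assumption, not something confirmed above. A small diagnostic like the sketch below could be run once by hand and once from a CI job to compare the two. The function name is made up; the calls are from the Python standard library.

```python
import importlib.util
import sys
from typing import Dict, Union


def describe_python(module_name: str = "paramiko") -> Dict[str, Union[str, bool]]:
    """Report which interpreter is running and whether it can find a module."""
    return {
        "interpreter": sys.executable,
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_python().items():
        print(f"{key}: {value}")
```

If the two runs report different interpreters, or only one of them finds paramiko, that points to an environment mismatch rather than a network problem. It would also explain why activating a virtual environment before each job makes the error go away.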
Finally, I reran the pipeline and received the result I was looking for all day:
I hope I was able to clearly describe my first experience creating a CI/CD pipeline without too much rambling. I wanted to write this series so that I can connect with others as they go through the struggles of any network automation project (in this case, creating a CI/CD pipeline). I want to help others discover what’s possible, while also outlining the struggles and issues I ran into along the way. Over the next couple of weeks, I plan to expand on this initial CI/CD pipeline example and potentially create something that can be used in production. Look forward to Part 2 next week! Thanks for reading!