Automating Ansible Role Management with Python
Mydex CIC describes its approach to optimising Ansible role and playbook management using a Python script integrated into a Jenkins pipeline.
#
Introduction
Managing roles and playbooks in Ansible can become complex and time-consuming, especially as your infrastructure grows. Over time, due to natural evolution of your platform (through upgrades, consolidation or changes in approach), it may be that some of your Ansible roles become obsolete and are no longer being used at all, offering an opportunity to clean up your codebase. It may also be that later playbooks written to solve adjacent tasks make use of the same role as another playbook. This might identify opportunities to refactor or merge playbooks. All in all, keeping your configuration management lean and trimmed of unneeded components is an excellent way of staying on top of technical debt at the infrastructure level.
#
Benefits of Automating Role Management
- Time Efficiency: By automating the process of checking roles and playbooks, you save significant time that would otherwise be spent manually reviewing and cross-referencing files.
- Accuracy: Automation reduces human error. The script ensures that every role is checked against every playbook. When there are a lot of playbooks or a lot of roles, one or more of them can be easily overlooked when reviewed manually.
- Clean and Organized Codebase: Regularly identifying and removing unused roles helps keep your codebase clean and efficient, which is crucial for maintaining an already complex platform. It may even help reduce your attack surface, since there is less code (and potentially fewer dependencies) to maintain.
- Preventing Redundancies: The script identifies roles that are referenced in multiple playbooks, allowing you to address potential redundancies and optimise role usage.
#
Scripting a check of Ansible roles
The following Python script automates the process of checking which roles are used in your Ansible playbooks, identifying unused roles, and flagging roles that are referenced in more than one playbook. This automation not only saves time but also helps maintain a clean and efficient Ansible setup.
Current version of the script
#!/usr/bin/env python3

import os
import yaml


def get_roles(directory):
    """Return the names of all non-hidden entries in a roles directory."""
    return {
        f
        for f in os.listdir(directory)
        if not f.startswith(".")
    }


def parse_playbooks(directory):
    """Parse every non-hidden playbook exactly once, keyed by file name."""
    playbooks = {}
    for filename in os.listdir(directory):
        if not filename.startswith("."):
            with open(os.path.join(directory, filename)) as f:
                # An empty file parses to None; treat it as having no plays.
                playbooks[filename] = yaml.safe_load(f) or []
    return playbooks


def check_roles(playbooks, roles_list, message):
    """Report which roles are used, unused, or referenced more than once."""
    print(f"\n >>> {message} {len(roles_list)} roles... \n")
    used_roles = set()
    doubles = set()

    for role in roles_list:
        for playbook_name, playbook_data in playbooks.items():
            for item in playbook_data:
                if "roles" in item:
                    roles = item["roles"]
                    # Matches roles listed as plain strings.
                    if role in roles:
                        if role not in used_roles:
                            used_roles.add(role)
                        else:
                            doubles.add(role)
                        print(f"Found role '{role}' in playbook '{playbook_name}'")

    unused_roles = roles_list - used_roles
    if not unused_roles:
        print(" All roles are used in at least one playbook!")
    else:
        for role in unused_roles:
            print(f"==== UNUSED ROLE ==== '{role}' not used by any playbook!")

    if doubles:
        print(f" {doubles} referenced in multiple playbooks!")

    return unused_roles, doubles


# Resolve paths relative to this script, not the current working directory.
script_directory = os.path.abspath(os.path.dirname(__file__))
parent_directory = os.path.dirname(script_directory)
ansible_directory = os.path.join(parent_directory, "ansible")

playbooks = parse_playbooks(os.path.join(ansible_directory, "playbooks/"))
roles_ls = get_roles(os.path.join(ansible_directory, "roles/"))
roles_ext_ls = get_roles(os.path.join(ansible_directory, "roles-external/"))

unused_roles, doubles = check_roles(playbooks, roles_ls, "Checking")
unused_roles_ext, doubles_ext = check_roles(playbooks, roles_ext_ls, "Checking external")

# An uncaught exception gives a non-zero exit status, which fails the CI job.
if unused_roles or unused_roles_ext:
    raise SystemError("Found unused role(s)! Maybe it's time for a clean-up...")
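One caveat with the membership test in check_roles is that it only matches roles listed as plain strings. Ansible also accepts dictionary entries in a play's roles list (for example a `role:` key alongside variables), and those would be reported as unused. The following is a minimal sketch of how such entries could be normalised, not part of the script above: the helper name `role_names` and the sample play are hypothetical, and it assumes the role is named under a `role` or `name` key.

```python
def role_names(play):
    """Return the set of role names referenced by one parsed play (a dict)."""
    names = set()
    for entry in play.get("roles") or []:
        if isinstance(entry, str):
            # Short form: "- common"
            names.add(entry)
        elif isinstance(entry, dict):
            # Assumption: the dictionary form names the role under "role" or "name".
            name = entry.get("role") or entry.get("name")
            if name:
                names.add(name)
    return names


# Hypothetical play, shaped the way yaml.safe_load would return it.
play = {
    "hosts": "all",
    "roles": [
        "common",
        {"role": "nginx", "vars": {"port": 8080}},
    ],
}

print(sorted(role_names(play)))  # ['common', 'nginx']
```

With a helper like this, the check inside the loop could compare each role against `role_names(item)` instead of the raw list.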
#
Speed and Performance
Our first version of the script had some unnecessary complexity in its logic. For each role and each playbook, the script opened the playbook file and loaded its content with yaml.safe_load. This meant that if there are n roles and m playbooks, the script would open and load each playbook file n times, leading to n * m file operations.
Also, the old script used lists for storing and checking roles. Checking whether an element is in a list is a linear scan, so it gets slower as the list grows, whereas a set offers constant-time membership tests on average.
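To get a feel for the difference between list and set membership tests, here is a small sketch using the standard library's timeit module. The role names and sizes are made up for illustration and the absolute timings will vary by machine, so treat it as a rough comparison rather than a benchmark of the scripts themselves.

```python
import timeit

# Hypothetical role names, purely for illustration.
names = [f"role_{i}" for i in range(5000)]
as_list = list(names)
as_set = set(names)
needle = "role_4999"  # worst case for the list: the last element

# Run each membership test 1000 times and record the total time.
list_time = timeit.timeit(lambda: needle in as_list, number=1000)
set_time = timeit.timeit(lambda: needle in as_set, number=1000)

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

Both tests return the same answer; only the cost of finding it differs.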
Old version of the script:
#!/usr/bin/env python3

import os
import yaml


roles_ls = [
    f
    for f in os.listdir("../ansible/roles/")
    if not f.startswith(".")
]

roles_ext_ls = [
    f
    for f in os.listdir(
        "../ansible/roles-external/"
    )
    if not f.startswith(".")
]

playbooks = [
    f
    for f in os.listdir(
        "../ansible/playbooks/"
    )
    if not f.startswith(".")
]


# How many roles do we have?
print(f"\n >>> Checking {len(roles_ls)} roles... \n")

roles_used_list = []
roles_doubles_list = []

for d in roles_ls:
    for p in playbooks:
        with open(
            f"../ansible/playbooks/{p}"
        ) as f:
            full_file = yaml.safe_load(f)
            for item in full_file:
                for i in item:
                    # Find the role
                    if i == "roles":
                        roles = item[i]
                        for r in roles:
                            # Checking...
                            if r == d:
                                if r not in roles_used_list:
                                    roles_used_list.append(r)
                                else:
                                    roles_doubles_list.append(r)
                                print(f"Found role '{r}' in playbook '{p}'")

# Comparing...
roles_unused_list = [r for r in roles_ls if r not in roles_used_list]
if roles_used_list == roles_ls:
    print(" All roles are used in at least one playbook!")

if roles_unused_list:
    print(f" {roles_unused_list} not used by any playbook")

if roles_doubles_list:
    print(f" {roles_doubles_list} referenced in multiple playbooks!")


# How many subroles do we have?
print(f"\n >>> Checking {len(roles_ext_ls)} external roles... \n")

roles_ext_used_list = []
roles_ext_doubles_list = []

for d in roles_ext_ls:
    for p in playbooks:
        with open(
            f"../ansible/playbooks/{p}"
        ) as f:
            full_file = yaml.safe_load(f)
            for item in full_file:
                for i in item:
                    # Find the role
                    if i == "roles":
                        roles = item[i]
                        for r in roles:
                            # Checking...
                            if r == d:
                                if r not in roles_used_list:
                                    roles_ext_used_list.append(r)
                                else:
                                    roles_ext_doubles_list.append(r)
                                print(f"Found external role '{r}' in playbook '{p}'")

# Comparing...
roles_ext_unused_list = [r for r in roles_ext_ls if r not in roles_ext_used_list]
if roles_ext_used_list == roles_ext_ls:
    print(" All subroles are used in at least a playbook!")

if roles_ext_unused_list:
    print(f" {roles_ext_unused_list} not used by any playbook")

if roles_ext_doubles_list:
    print(f" {roles_ext_doubles_list} referenced in multiple playbooks!")
The current version of the script parses each playbook only once and reuses the parsed data for every role, so it stays fast even with a large number of roles and playbooks. The use of sets for role management further improves speed by providing fast membership testing.
This is the time spent running the old script:
real 0m2.164s
user 0m2.076s
sys 0m0.045s
The new script runs about 27 times faster (0.079 s of wall-clock time, down from 2.164 s):
real 0m0.079s
user 0m0.054s
sys 0m0.023s
Depending on how many roles and playbooks you have, this could amount to a significant speed increase and reduction in compute power.
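If the number of roles and playbooks keeps growing, one further option is to invert the loop: walk the parsed playbooks once and build an index from role name to the playbooks that reference it. This is a different technique from the script shown earlier, sketched here under assumptions: the function name `index_roles` and the sample data are hypothetical, and only string-style role entries are handled.

```python
from collections import defaultdict


def index_roles(playbooks):
    """Map each role name to the set of playbooks that reference it.

    `playbooks` maps a file name to its parsed content (a list of plays).
    """
    index = defaultdict(set)
    for playbook_name, plays in playbooks.items():
        for play in plays or []:
            for entry in play.get("roles") or []:
                if isinstance(entry, str):  # dict-style entries are ignored here
                    index[entry].add(playbook_name)
    return index


# Hypothetical parsed playbooks and role directory listing.
playbooks = {
    "web.yml": [{"hosts": "web", "roles": ["common", "nginx"]}],
    "db.yml": [{"hosts": "db", "roles": ["common", "postgres"]}],
}
roles_on_disk = {"common", "nginx", "postgres", "legacy"}

index = index_roles(playbooks)
unused = roles_on_disk - set(index)
shared = {role for role, users in index.items() if len(users) > 1}

print(sorted(unused))  # ['legacy']
print(sorted(shared))  # ['common']
```

The work done is proportional to the number of role references in the playbooks rather than to roles multiplied by playbooks, and the index also records which playbooks share a role.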
#
Integration with Jenkins for Continuous Monitoring
Integrating this script with Jenkins allows for continuous and automated monitoring of your Ansible setup. By configuring Jenkins to run this script periodically, you can ensure that any unused or redundant roles are promptly identified and addressed. Here’s how you can set this up:
- Clone your Ansible Repository: Configure a Jenkins job to periodically clone your Ansible repository. This ensures that the script always runs against the latest version of your playbooks and roles.
- Run the Script: Add a build step to execute the Python script. Ensure that the script is executable and that Jenkins has the necessary permissions to run it.
- Handle Output: Configure Jenkins to handle the output of the script. If the script raises a SystemError due to unused roles, Jenkins can send a notification to the relevant team members for further action.
- Schedule the Job: Set up a periodic schedule for the Jenkins job, such as nightly or weekly, depending on how frequently your Ansible setup changes. Depending on your workflow and network topology, you might also be able to trigger a webhook that launches a job when changes occur in your repository, rather than polling it from the other side.
Here’s a sample Jenkins pipeline script to integrate the Python script:
pipeline {
    agent any

    stages {
        stage('Clone Repository') {
            steps {
                git 'https://your-git-repo-url/your-repo.git'
            }
        }
        stage('Run Role Checker') {
            steps {
                sh 'path/to/your/script.py'
            }
        }
    }
    post {
        failure {
            mail to: 'team@example.com',
                 subject: "Unused Ansible Roles Detected",
                 body: "There are unused Ansible roles detected in the latest check. Please review and clean up."
        }
    }
}
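The failure notification relies on the exit status of the script: an uncaught Python exception makes the interpreter exit with a non-zero status, and a Jenkins sh step treats a non-zero status as a failed step by default. The sketch below demonstrates the exit status locally with a stand-in for the role checker; the inline source and the "legacy" role name are hypothetical.

```python
import subprocess
import sys

# A stand-in for the role checker: it raises when a (hypothetical) unused role exists.
checker_source = """
unused_roles = {"legacy"}
if unused_roles:
    raise SystemError("Found unused role(s)!")
"""

# Run the stand-in in a separate interpreter, as a CI job would.
result = subprocess.run(
    [sys.executable, "-c", checker_source],
    capture_output=True,
    text=True,
)

print(result.returncode)  # non-zero, so a Jenkins `sh` step would fail
print("SystemError" in result.stderr)  # the traceback is written to stderr
```

The traceback ends up in the job's console log, which gives the engineer receiving the notification the list of roles to review.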
#
Conclusion
Automating the management of Ansible roles with a Python script brings significant benefits in terms of time efficiency, accuracy, and codebase cleanliness. Integrating this script with Jenkins further enhances its utility by providing continuous monitoring and immediate feedback on the state of your roles and playbooks. A convenient feature of Jenkins is that it can easily be configured to send notifications or alerts only when there is a problem (e.g. a build failure). This helps avoid ‘alert fatigue’ by ensuring that only problems are brought to the engineers’ attention. Otherwise, the ‘No News Is Good News’ mantra applies. This automation not only streamlines your workflow but also helps keep your Ansible setup robust and maintainable.