Last updated: Sept. 8, 2021
Ansible is a well-known open source automation engine that handles provisioning, configuration management, and orchestration. Because it is agentless (it connects over SSH) and its simple modules spare you from writing code, Ansible eases the deployment and management of your applications!
Before discussing how to optimize your Ansible configuration, here is a quick reminder of how it works. The following picture shows an “Ansible Management node”. This host performs operations on the target infrastructure by pushing configurations through playbooks and roles. Each host named in the playbook is configured the way you expect it to be. Finally, hosts can be organized into groups through an inventory file, which helps us decide which hosts we are controlling, when, and for what purpose:
We can use Ansible for many tasks, such as provisioning virtual machines, applying configurations, or even patching them.
However, in some contexts like continuous delivery, fast Ansible scripts (called playbooks) are required to get rapid feedback and to reduce the load Ansible puts on the target servers.
In this article, we are going to look at important concepts that can greatly improve Ansible’s performance, and then go through some benchmarks to quantify the possible improvements.
Good practices to speed up playbooks
Yum calls are expensive!
One important Ansible module is the yum module. You can use it to install, upgrade, downgrade, remove, or list packages and groups with the yum package manager (on Debian-based systems, the apt module plays the same role). A common mistake is to invoke the same module several times in separate tasks, like so:
- name: install the latest version of nginx
  yum:
    name: nginx
    state: latest

- name: install the latest version of postgresql
  yum:
    name: postgresql
    state: latest

- name: install the latest version of postgresql-server
  yum:
    name: postgresql-server
    state: latest
Each yum call is expensive! Ansible knows how to group the installation of multiple packages into a single yum or apt transaction, so installing all the required packages in a single task is a huge optimization:
- name: Install a list of packages
  yum:
    name:
      - nginx
      - postgresql
      - postgresql-server
    state: present
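The same idea applies when the package list lives in a variable: passing the whole list to the module keeps everything in one task, whereas looping over the list would call the module once per item. A minimal sketch, assuming a hypothetical variable named web_packages (the webservers group is reused from the async example in this article):

```yaml
---
- name: Install packages from a variable in a single task
  hosts: webservers
  become: true
  vars:
    web_packages:            # hypothetical variable name, for illustration only
      - nginx
      - postgresql
      - postgresql-server
  tasks:
    - name: Install every package with one yum call
      yum:
        name: "{{ web_packages }}"   # the whole list is handed to yum at once
        state: present
```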
Avoid using Shell or Command modules
To run a shell command on an Ansible host, you can use modules like shell or command. Both are time-consuming, as we will see in the benchmark. Always check whether a more appropriate module exists:
- name: Create a directory (BAD WAY using a shell command)
  shell: mkdir /tmp/sokube

- name: Create a directory (GOOD WAY using a module)
  file:
    path: /tmp/sokube
    state: directory
This is not only faster, it also leverages the idempotence of modules: after one run of a playbook has set things to the desired state, further runs of the same playbook should result in zero changes. In simpler terms, idempotence means that Ansible playbooks can be executed several times without side effects, so that the consistency of the environment is maintained.
We will see in our benchmark how efficient it is when you use modules, instead of shell commands.
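When no dedicated module exists and a command is unavoidable, the creates argument of the command module can at least restore idempotent behavior: the task is skipped once the named file exists. A minimal sketch, in which the script and marker file paths are hypothetical:

```yaml
- name: Run a one-time installer only if it has not run yet
  command: /opt/sokube/install.sh        # hypothetical script, for illustration only
  args:
    creates: /opt/sokube/.installed      # hypothetical marker file the script is assumed to create
```

On a second run, Ansible finds the marker file and reports the task as skipped instead of executing the command again.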
Select the best Strategy
When running a playbook, Ansible uses a strategy, which is basically the playbook’s workflow. Selecting the right strategy matters if we want to improve efficiency. The default one is linear: it runs each task on all targeted hosts (in batches limited by the forks setting) and waits for the task to complete on every host before starting the next one.
If the target hosts are independent of each other, we can consider the “free” strategy. Each host then runs through its tasks as fast as it can, regardless of the status of tasks on other hosts, as illustrated in the following picture:
We can define custom strategies by developing plugins or use existing plugins like mitogen, which we will discuss later on this page.
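The strategy is selected per play with the strategy keyword. A minimal sketch of a play using the free strategy, reusing the webservers group and the tasks shown earlier in this article purely as placeholders:

```yaml
---
- name: Configure independent hosts without waiting for each other
  hosts: webservers
  strategy: free             # the default is linear
  tasks:
    - name: Install nginx
      yum:
        name: nginx
        state: present

    - name: Create a working directory
      file:
        path: /tmp/sokube
        state: directory
```

With this setting, a fast host can reach the second task while a slower host is still busy with the first one.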
The forks setting defines the maximum number of hosts Ansible connects to simultaneously for each task. It lets you control how many hosts are affected at the same time. By default, the value is 5, which means that only 5 hosts are configured at the same time. We can increase that value as long as it doesn’t exhaust the resources of your infrastructure.
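Forks can be configured in the ansible.cfg file, with the forks key of the [defaults] section (for example, forks = 30; this value is only illustrative).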
Configure Async tasks
By default, Ansible runs tasks synchronously, holding the connection to the remote node open until the action is completed. When a task is truly independent, that is, no other task waits for it to finish before starting, defining it as asynchronous can significantly shorten the overall execution, as shown in the example below:
---
- name: My Playbook to test Async and Poll
  hosts: webservers
  tasks:
    - name: Copy the script from Ansible host to node for testing
      copy:
        src: "my-longrunning-script.sh"
        dest: "/tmp"

    - name: Execute the long running script
      shell: "chmod a+x /tmp/my-longrunning-script.sh && /tmp/my-longrunning-script.sh 60"  # Run for 60 seconds
      async: 120