Throttling Ansible's local_action Module

Author: Sam Griffith

‘Why would you ever want to limit the power of Ansible?’ you may be asking.

Well, I never thought that I would want to until I found a very good reason: a race condition.

Here is the playbook that convinced me to look into limiting Ansible:

---
- name: Retrieve System Resource Information
  hosts: all
  strategy: linear
  tasks:
  - name: Apt install sysstat
    apt:
      name: sysstat
      state: present
    become: yes

  - name: total_mem
    shell: free -h | grep Mem | awk '{print $2}'
    register: total_mem

  - name: available_mem
    shell: free -h | grep Mem | awk '{print $7}'
    register: available_mem

  - name: idle_cpu
    shell: mpstat | grep all | awk '{print $13}'
    register: idle_cpu

  - set_fact:
      stats: "{{ ansible_hostname }} Memory: {{ available_mem.stdout }}/{{ total_mem.stdout }} CPU Available: {{ idle_cpu.stdout }}"

  - debug:
      var: stats

  - local_action: file name=../current_stats.txt state=touch
    become: no

  - local_action: lineinfile insertafter=EOF line={{ stats }} dest=../current_stats.txt
    become: no

If you can read Ansible playbooks, you can see that this playbook performs a series of rather simple tasks:

  • Install a package called sysstat (necessary for the mpstat command) using apt
  • Run three shell commands to get total_mem, available_mem, and the idle_cpu percentage
  • Create a variable, stats, for later use
  • Debug (print to screen) the stats variable
  • Touch a file on the local machine, ../current_stats.txt
  • Use lineinfile to add one line to that file for each host's stats variable (1 per host)

Everything in this playbook was working exactly as expected, other than the last task.

I was seeing very odd behavior when using local_action to save the results of my basic resource-gathering playbook.

When I ran cat on the file I had written to on my local machine, I saw a varying number of lines for a fixed number of hosts.

In other words, when running this playbook against 8 hosts, I should expect to see 8 lines of output. Running it several times to verify the behavior, I saw anywhere from 2 to 7 lines of output.

Run 1:

sumi-08 Memory: 198G/251G CPU Available: 97.48 VMs: 107
sumi-07 Memory: 198G/251G CPU Available: 97.55 VMs: 107

Run 2:

sumi-04 Memory: 232G/236G CPU Available: 99.98 VMs: 3
sumi-05 Memory: 248G/251G CPU Available: 99.98 VMs: 3
sumi-03 Memory: 162G/251G CPU Available: 94.71 VMs: 183
sumi-07 Memory: 198G/251G CPU Available: 97.55 VMs: 107
sumi-08 Memory: 198G/251G CPU Available: 97.48 VMs: 107
sumi-06 Memory: 225G/251G CPU Available: 94.21 VMs: 6
sumi-02 Memory: 249G/251G CPU Available: 99.99 VMs: 2

Run 3:

sumi-01 Memory: 248G/251G CPU Available: 99.98 VMs: 3
sumi-03 Memory: 162G/251G CPU Available: 94.71 VMs: 183
sumi-08 Memory: 198G/251G CPU Available: 97.48 VMs: 107
sumi-05 Memory: 248G/251G CPU Available: 99.98 VMs: 3
sumi-06 Memory: 225G/251G CPU Available: 94.21 VMs: 6

This means that Ansible was in a race condition against itself!
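
A mismatch like this can also be caught by the playbook itself instead of by eye. The two tasks below are a minimal sketch and not part of the original playbook. They assume that ../current_stats.txt was empty before the run and that they are appended after the lineinfile task. The task names and the line_count variable are made up for illustration.

```yaml
  # Hypothetical check, appended after the lineinfile task.
  # Assumes ../current_stats.txt was empty before this run.
  - name: Count the lines written to the stats file
    command: wc -l ../current_stats.txt
    delegate_to: localhost
    run_once: true
    become: no
    changed_when: false
    register: line_count

  - name: Fail if any host's line went missing
    assert:
      that:
        # wc prints "<count> <file>", so keep only the first field
        - (line_count.stdout.split() | first | int) == (ansible_play_hosts | length)
      fail_msg: "Expected one line per host in ../current_stats.txt"
    run_once: true
```

ansible_play_hosts is the list of hosts still active in the current play, so the comparison only holds when no host failed earlier in the run.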

The Issue

The lineinfile module was attempting to write my stats to the end of the file, EOF. But Ansible is fast. Super fast. And Ansible is specifically designed to run in parallel. It wanted to write to the end of the file with as many process forks as it could use to get the job done quickly.

But lineinfile does not simply append: it reads the file, adds its line, and writes the whole file back. With several forks doing that at the same moment, each one started from the same older copy of the file, so whichever fork finished last overwrote the lines the others had just written.

I found myself in a situation where I wanted the code to execute more slowly.

The Fix

There are several ways to slow Ansible down on purpose. The first one that comes to mind is to limit the number of forks. By default, Ansible gives you 5 process forks to work with, which allow your processor(s) to handle multiple Python calls in parallel. Generally, I turn my forks up to 200, as I am running Ansible from a fairly beefy server.

Theoretically, I could turn the number of forks down to 1 so Ansible would be forced to perform all of its tasks serially. However, that is a very drastic move and one I would generally avoid, because it would slow down the entire playbook, not just the one racing task.

To narrowly focus on slowing Ansible down at the task level, I found the easiest way is to simply use the throttle keyword. It can be applied at either the task or the play level, and it limits the number of workers to however many you specify.

Note: throttle cannot raise the number of forks during a play or task. It can only reduce the number of workers below the pre-set fork limit.
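
To make the note concrete, here is a small sketch of how the two limits interact. It assumes forks is left at the default of 5. The task names and debug messages are placeholders, and the block-level form is included on the assumption that throttle is accepted on blocks as well as on tasks, as the Ansible keyword documentation lists it.

```yaml
  # Sketch only: assumes forks = 5 and made-up task names.
  - name: Runs on at most 2 hosts at a time
    debug:
      msg: "throttle below forks takes effect"
    throttle: 2

  - name: Still runs on at most 5 hosts at a time
    debug:
      msg: "throttle above forks cannot add workers"
    throttle: 50

  - name: Every task in this block is limited to 1 host at a time
    throttle: 1
    block:
      - name: First serialized task
        debug:
          msg: "one host at a time"
```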

Now, take a look at my last task, using the throttle keyword.

  - local_action: lineinfile insertafter=EOF line={{ stats }} dest=../current_stats.txt
    become: no
    throttle: 1

By adding this one task-level keyword, I slowed the single task down and eliminated the race condition that I was seeing.

All of the subsequent tests showed a full 8 out of 8 lines of output. And as an added benefit, they were all saved in the order of the hosts!

New (and consistent!) output:

sumi-01 Memory: 248G/251G CPU Available: 99.98 VMs: 3
sumi-02 Memory: 249G/251G CPU Available: 99.99 VMs: 2
sumi-03 Memory: 161G/251G CPU Available: 94.71 VMs: 183
sumi-04 Memory: 232G/236G CPU Available: 99.98 VMs: 3
sumi-05 Memory: 248G/251G CPU Available: 99.98 VMs: 3
sumi-06 Memory: 226G/251G CPU Available: 94.19 VMs: 6
sumi-07 Memory: 174G/251G CPU Available: 97.45 VMs: 107
sumi-08 Memory: 175G/251G CPU Available: 97.40 VMs: 107
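
For readers who prefer the expanded YAML syntax, the throttled task can also be written with delegate_to, which does roughly the same job as local_action. This is a sketch of an equivalent task rather than the one used in the playbook. The task name is made up, and path is the documented parameter name that dest is an alias for.

```yaml
  # Equivalent sketch of the throttled task in expanded syntax.
  - name: Append this host's stats to the local file
    lineinfile:
      path: ../current_stats.txt
      line: "{{ stats }}"
      insertafter: EOF
    delegate_to: localhost
    become: no
    throttle: 1   # one host at a time, so no two forks rewrite the file at once
```

lineinfile also has a create option, which could probably stand in for the separate touch task, though that variation was not tested here.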

If you want to look at additional ways you can control certain aspects of playbook execution, check out the Ansible documentation on controlling playbook execution.

Other Thoughts

After reviewing the error with a colleague of mine, I tried solving this problem in two other ways. First, I looked at Ansible’s syslogger module. Reading through the documentation, it looks practically perfect, and I am sure that it would resolve the race condition. However, the facility parameter appears to be the only way to influence which file the message ends up in and where that file lives, and it only accepts a fixed set of values.

Due to the nature of my project, I needed to have a specially named file, in a specific location at the end of this playbook. Therefore, I ruled out the syslogger module as an alternative.
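
For reference, a syslogger task might look like the sketch below. It assumes the module is available under the short name syslogger (in newer Ansible releases it ships in the community.general collection), and the task name and the choice of local0 are made up for illustration. The facility value has to come from the module's fixed list of syslog facilities, and the file the message lands in is decided by the local syslog configuration, not by the task.

```yaml
  # Sketch only: where this message ends up depends on the local syslog daemon's rules.
  - name: Send this host's stats to the local syslog
    syslogger:
      msg: "{{ stats }}"
      priority: info
      facility: local0   # must be one of the predefined syslog facilities
    delegate_to: localhost
    become: no
```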

Another attempt I made was to use the shell module. This command actually worked as I had initially thought lineinfile would: local_action: "shell echo {{ stats }} >> ../current_stats.txt".

This is certainly a viable approach, and one that I am not personally against using in a local_action, as I have control over my own shell. However, when the shell module's documentation tells you to avoid it unless necessary, there are probably better ways of accomplishing the task:

If you want to execute a command securely and predictably, it may be better to use the command module instead. Best practices when writing playbooks will follow the trend of using command unless the shell module is explicitly required. 1
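
If the shell route is taken anyway, one way to reduce surprises is to pass the value through Ansible's quote filter, so that spaces or shell metacharacters in stats are not interpreted by the shell. The task below is a sketch in expanded syntax, not the exact command shown above, and the task name is made up.

```yaml
  # Sketch only: the quote filter escapes the value before the shell sees it.
  - name: Append this host's stats with a shell redirect
    shell: echo {{ stats | quote }} >> ../current_stats.txt
    delegate_to: localhost
    become: no
```

The >> redirect appends to the file instead of rewriting it, which is likely why short lines like these did not clobber each other the way the lineinfile writes did.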

In Conclusion

I have decided to stick with the throttle keyword on my task, as it is a ‘purer’ Ansible solution and does exactly what I need it to do.