Restore Backed Up Files Using Ansible

Author: Sean Barbour

We all agree that backing up data is critical, but how often do you actually restore your critical data? At our company, the answer is at least once every week. We completely demolish all our systems just for sport, which means restoring critical data is necessary to put it all back. Let’s take a look at an Ansible role that will restore data quickly and reliably. To use this role, you need only modify the data source and destination. If you are new to Ansible, then this is a great project to begin building DevOps skills.

Rebuilding a replacement server from scratch and restoring the data exercises the full cycle of backup AND restore. We routinely do this each week as part of a complete cloud rebuild. The effort is more than just a fire drill: our cloud’s efficiency improves every week, not to mention our DevOps skills.

So let me take you through the thought process I followed to move a multitude of different file types from backup to production.

My Process

There is a three-step process I use with much of my work, especially if that work is relatively new to me and hasn’t been done before. It comes straight from “Turn the Ship Around!” by L. David Marquet:

  1. What do I intend to accomplish? (I intend to)
  2. What am I doing this for? (I am doing this because)
  3. What is the end result I expect to see? (I expect ____ to happen)

Answering those questions gives me a good start-to-finish picture of the process. It may seem a bit overboard for a task as straightforward as this one, but it keeps me using the same process every time, which is important when switching over to a high-priority or dangerous task.

To give a bit more depth, let me provide the answers I usually keep in my head. There is also nothing wrong with writing this information down until you get a good handle on the process; I keep a note attached to my desk with this very philosophy.

  1. What do I intend to accomplish? (I intend to) I intend to put my locally backed up data onto a server that I just rebuilt.

  2. What am I doing this for? (I am doing this because) I am doing this because I want to be able to quickly push data from a backup to a new server if our ingress server dies for some reason. Doing so would eliminate many issues a class may have if a server goes down.

  3. What is the end result I expect to see? (I expect ____ to happen) I expect all static content to be pushed and available to each of our ingress servers.

I knew I was going to use Ansible for this task because I’m very comfortable with it, and it’s been the golden hammer here at Alta3 Research for years. The next task was to determine whether I would use shell commands or whether Ansible had a module I could use instead. Enter Google. My search was “ansible copy”, because copying is exactly what I intended to do.

The “ansible copy” search brought me to the first hit, which was the Ansible copy module documentation. Perfect! In viewing this doc, make sure you read the entire document…and then read it again! You will be surprised how much you learn or understand differently the second time around. The copy module allows you to copy a file from one machine (local or remote) to another machine (most likely remote). In our situation, we will be copying from local to remote. Let’s go over the parameter options: why I used some and why I skipped others. The ones I chose to use are dest, directory_mode, group, mode, owner, and src.
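
To make the local-to-remote idea concrete, here is a minimal sketch of the module with only the two essentials, src and dest. The paths are placeholders for illustration, not the ones used later in this article:

- name: copy a single local file to the remote host
  copy:
    src: /path/on/control/node/example.txt    # file on the machine running Ansible
    dest: /path/on/remote/host/example.txt    # where it lands on the target server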

Ansible Copy Module Parameter Options

If you are in a TL;DR hurry, then skip this section and reference it later when reading my code below.

  • attributes
    These are the filesystem attributes the resulting file/directory will have. We will explicitly set what we need with other parameters, so we won’t use this one.

  • backup
    We’re not going to use this one because, with the method we are going forward with, the endpoint will have no files (or missing files) to begin with, so there is nothing to back up before overwriting. The default for this parameter is “no” anyway.

  • checksum
    We won’t use this option either. I’ve never felt the content we are moving warrants that kind of validation.

  • content
    This parameter is also unnecessary in our context. We will instead use “src” and copy the entire contents of the directory.

  • decrypt
    No need for Ansible Vault at the moment. This might be useful later on as the role gets deployed alongside other Ansible playbooks.

  • dest
    This is a required piece of the copy module and tells Ansible where the file(s) should be placed on the remote server.

  • directory_mode
    Because we are copying multiple levels of directories and files, we will use this option so that newly created directories do not fall back to system defaults on the remote server. We will also use the group, owner, and mode options.

  • follow
    Since we are copying static content, we do not have a need for the follow option at this point. If you were copying Ansible Playbooks with many roles, there might be a reason to follow the links you would likely have.

  • force
    This is actually a very important option to decide on. I will not be using it because these files are going to a brand-new server, so there is nothing to overwrite (or to protect from being overwritten). The default is to overwrite, so be very careful with it.

  • group
    There are a few options in the copy module that I almost always use, and “group” is one of them. I like to explicitly designate ownership and permissions because I find that helpful in directory structures.

  • local_follow
    We aren’t going to use this here; it follows symlinks on the local side, much like the “follow” parameter, which we do not need in this situation.

  • mode
    Just like the group parameter, we want to explicitly set the permissions so we don’t have to worry about access on the remote end after the file transfer.

  • owner
    As I mentioned in the mode and group sections, I want to explicitly designate ownership so that I know I have the right permissions set on my files.

  • remote_src
    By default, this module looks on the machine running the Ansible playbook (localhost) for the directories/files to be copied, so we do not need to set remote_src at all. It is a useful parameter when copying from one remote server to another; for example, it would come into play if we were moving data from one of our backup servers to the new server via a playbook running locally on one of our jump hosts (see the sketch after this list).

  • selevel, serole, setype, seuser
    We don’t have to worry about the SELinux options here. Security-Enhanced Linux is a Linux kernel security module that denies access by default and then grants fine-grained permissions per application/element in order to prevent unwanted access to the system. Our system is already set to allow these specific actions without the se* group of parameters.

  • src
    This is the location of the file(s)/dir(s) that we want to copy, so it is obviously a must-have for this module. We’re going to use an absolute local path. You could use a variable here if the path is long or used in multiple places.

  • unsafe_writes
    Again, this would not be a good fit for this project because we want atomic operations. One case where it matters is Docker with mounted files, where the target can sometimes only be written to non-atomically; in that example you would have to explicitly set this parameter to “yes”.

  • validate
    Because we are pushing static files onto what is pretty much a blank slate, we aren’t going to worry about validating any files on the remote server. If we were pushing a running nginx configuration or an access-control file like sudoers, validate would let the file pass a validity test (making sure the configuration works) before it is copied into place on the remote host (see the sketch after this list).
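
Even though I skipped them for this project, it helps to see what a few of the unused parameters look like in practice. The task below is a sketch for illustration only, not part of my role; the file paths are hypothetical, and it shows remote_src, backup, force, and validate together in the kind of sudoers scenario mentioned above:

- name: example only - copy a sudoers drop-in with the parameters I skipped
  copy:
    src: /etc/sudoers.d/admins-staged     # hypothetical file already on the remote host
    remote_src: yes                       # remote-to-remote copy instead of local-to-remote
    dest: /etc/sudoers.d/admins
    backup: yes                           # keep a timestamped copy of anything we overwrite
    force: yes                            # overwrite the destination if it differs (the default)
    validate: /usr/sbin/visudo -cf %s     # syntax-check the file before it is put in place
  become: true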

The Implemented Code

After going through the 21 parameters and choosing the right ones for this project, here is the initial code, which will later be put into a role so it can be included in many other playbooks if need be.

---
- name: copy
  hosts: prod-nginx
  gather_facts: true
  tasks:
    - copy:
        src: /var/a3-backups/prod-nginx
        dest: /var/www/html/static.alta3.com
        mode: '0755'
        group: ubuntu
        owner: ubuntu
        directory_mode: '0755'

Another piece of the puzzle I used after this task was a two-step check that a file from the local machine actually made it to the remote machine (assuming you don’t completely trust Ansible’s idempotent methodology). We run the stat module against the path of the inspection.jpg file, register the result in a variable, and then run debug with the message “The file exists on the remote machine now” whenever stat reports that the file exists.

- name: check if file exists
  stat:
    path: /var/www/html/static.alta3.com/blog/inspection.jpg
  register: does_file_exist

- debug:
    msg: "The file exists on the remote machine now"
  when: does_file_exist.stat.exists

Check out the following links for more info on using conditionals in Ansible:
- https://docs.ansible.com/ansible/latest/user_guide/playbooks_conditionals.html
- https://riptutorial.com/ansible/example/12269/when-condition
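
If you want the verification to do more than print a friendly message, one small variation (a sketch, not part of my role) is to fail the play outright when the file never arrived, reusing the same registered variable:

- name: stop the play if the expected file is missing
  fail:
    msg: "inspection.jpg did not make it to the remote machine"
  when: not does_file_exist.stat.exists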

Because this is a role that will pull its variables from defaults/main.yml, I first created the task without variables, as shown above. The next step was to push it into a roles directory and replace the full paths with default variables. Here is what the role’s tasks file looks like now:

---
- name: copy static files out to servers
  copy:
    src: "{{ src_dir }}"
    dest: "{{ dest_dir }}"
    mode: '0755'
    group: upload
    owner: upload
    directory_mode: '0755'
  become: true

- name: check if file exists
  stat:
    path: "{{ dest_dir }}/{{ test_file }}"
  register: does_file_exist


- debug:
    msg: "The file exists on the remote machine now"
  when: does_file_exist.stat.exists

And this is what my defaults/main.yml file looks like:

---
src_dir: /var/a3-backups/prod-nginx
dest_dir: /var/www/html/static.alta3.com
test_file: blog/inspection.jpg
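
Because these live in defaults/main.yml, they sit at the lowest variable precedence in Ansible, so any playbook that calls the role can override them. As a hypothetical example (these paths are made up for illustration), the same role could restore a different content set like this:

- name: restore a different content set with the same role
  hosts: prod-nginx
  roles:
    - role: content-restore
      vars:
        src_dir: /var/a3-backups/other-backup          # hypothetical source
        dest_dir: /var/www/html/other.example.com      # hypothetical destination
        test_file: index.html                          # hypothetical spot-check file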

Testing it Out

The next item on the list is to test that role. I’ve found it’s easiest to have a tests directory with a simple playbook to test particular roles, before putting them in a full-fledged playbook. Here is the playbook below.

---
- name: test role
  hosts: prod-nginx
  gather_facts: True
  roles:
    - role: content-restore

A simple way to test this role was to point the variables at an easily removable test file. Roles do not care about hosts, so we left that out of the role itself and put it in our test playbook. Then we created a temporary hosts file that included:

[defaults]
prod-nginx
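
For context, here is roughly how all of the pieces sit on disk. This layout is my reading of the task paths in the output below, so treat it as a sketch:

tower2/
  ansible.cfg
  hosts                        <- the temporary inventory shown above
  test.yml                     <- the test playbook
  roles/
    content-restore/
      defaults/
        main.yml               <- src_dir, dest_dir, test_file
      tasks/
        main.yml               <- the copy, stat, and debug tasks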

The command used to test was ansible-playbook -i hosts test.yml -vv. Here are the results:

seaneon@tower2:~/git/alta3-ansible/undercloud_playbooks/tower2$ ansible-playbook -i hosts test.yml -vv
ansible-playbook 2.9.6
  config file = /home/seaneon/git/alta3-ansible/undercloud_playbooks/tower2/ansible.cfg
  configured module search path = ['/home/seaneon/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.6/dist-packages/ansible
  executable location = /usr/local/bin/ansible-playbook
  python version = 3.6.9 (default, Apr 18 2020, 01:56:04) [GCC 8.4.0]
Using /home/seaneon/git/alta3-ansible/undercloud_playbooks/tower2/ansible.cfg as config file

PLAYBOOK: test.yml **************************************************************************************************************************************************************************************************************
1 plays in test.yml

PLAY [test role] ****************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************
task path: /home/seaneon/git/alta3-ansible/undercloud_playbooks/tower2/test.yml:2
ok: [prod-nginx]
META: ran handlers

TASK [content-restore : copy static files out to servers] ***********************************************************************************************************************************************************************
task path: /home/seaneon/git/alta3-ansible/undercloud_playbooks/tower2/roles/content-restore/tasks/main.yml:2
changed: [prod-nginx] => {"changed": true, "checksum": "9801739daae44ec5293d4e1f53d3f4d2d426d91c", "dest": "/var/www/html/static.alta3.com/blog/testfile06172020.txt", "gid": 1001, "group": "upload", "md5sum": "eb1a3227cdc3fedbaec2fe38bf6c044a", "mode": "0755", "owner": "upload", "size": 8, "src": "/home/ubuntu/.ansible/tmp/ansible-tmp-1592434300.8994968-255537019536423/source", "state": "file", "uid": 1001}

TASK [content-restore : check if file exists] ***********************************************************************************************************************************************************************************
task path: /home/seaneon/git/alta3-ansible/undercloud_playbooks/tower2/roles/content-restore/tasks/main.yml:12
ok: [prod-nginx] => {"changed": false, "stat": {"atime": 1592434301.871255, "attr_flags": "e", "attributes": ["extents"], "block_size": 4096, "blocks": 8, "charset": "us-ascii", "checksum": "9801739daae44ec5293d4e1f53d3f4d2d426d91c", "ctime": 1592434301.8792548, "dev": 2049, "device_type": 0, "executable": true, "exists": true, "gid": 1001, "gr_name": "upload", "inode": 2583572, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "mimetype": "text/plain", "mode": "0755", "mtime": 1592434301.3912628, "nlink": 1, "path": "/var/www/html/static.alta3.com/blog/testfile06172020.txt", "pw_name": "upload", "readable": true, "rgrp": true, "roth": true, "rusr": true, "size": 8, "uid": 1001, "version": "829138049", "wgrp": false, "woth": false, "writeable": false, "wusr": true, "xgrp": true, "xoth": true, "xusr": true}}

TASK [content-restore : debug] **************************************************************************************************************************************************************************************************
task path: /home/seaneon/git/alta3-ansible/undercloud_playbooks/tower2/roles/content-restore/tasks/main.yml:18
ok: [prod-nginx] => {
    "msg": "The file exists on the remote machine now"
}
META: ran handlers
META: ran handlers

PLAY RECAP **********************************************************************************************************************************************************************************************************************
prod-nginx                 : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

After cleaning up my structure, the role worked in a test playbook. All that needs to be done now is attaching the role to the appropriate playbook.

This is very useful for an initial push of data to a destination. If you are moving potentially updated data to a destination that may already contain some of that data, it would be wise to look at the synchronize module instead, since it only transfers what has changed and gives you finer control over whether existing data on the destination gets overwritten or deleted. So who knows, I might just create another Ansible role that mirrors what I ran here, swapping copy for synchronize and its appropriate parameters.
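
If I do, a first sketch of the synchronize version might look something like the task below. This is an assumption on my part rather than something I have run for this project; synchronize wraps rsync, and archive, delete, and rsync_opts are the main knobs I would start with:

- name: push static content with rsync semantics instead of copy
  synchronize:
    src: "{{ src_dir }}/"
    dest: "{{ dest_dir }}"
    archive: yes                # preserve permissions, ownership, and timestamps
    delete: no                  # never remove files that exist only on the destination
    rsync_opts:
      - "--update"              # skip files that are already newer on the destination
  become: true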