Avoiding duplicate entries in authorized_keys (ssh) in bash and ansible

Popular methods of adding an ssh public key to a remote host’s authorized_keys file include using the ssh-copy-id command, and using bash operators such as >> to append to the file.

An issue with ssh-copy-id is that this command does not check if a key already exists. This creates a hassle for scripts and automations because subsequent runs can add duplicate key entries. This command is also not bundled with MacOS, creating issues for some Mac users (though it can be installed with Homebrew).

This post covers a solution that adds a given key to authorized_keys only if that key isn’t already present in the file. Examples are provided in bash and for ansible using ansible’s shell module (old versions) and authorized_key module (newer versions).

For shell scripts, there seem to be a lot of solutions out there for this common problem, but I think a lot of them overcomplicate things with sed, awk, uniq, and similar commands; or go overboard by implementing standalone utilities for the task. One thing I don’t like about many of the working solutions that I’ve come across is when the authorized_keys file is reordered as a side-effect.

Note that ssh authentication works fine when there are multiple identical authorized_keys entries. However, accumulating junk in this file can create performance issues, and can make troubleshooting, auditing, and other admin tasks more difficult. When a remote host tries to authenticate, ssh works its way down the authorized_keys file until it comes across a match.

Adding a unique entry to authorized_keys

The following is a one-liner to be run by a user that can authenticate with the remote server.

Modify the snippet below to suit your needs:

ssh -T user@central.example.com "umask 0077 ; mkdir -p ~/.ssh ; grep -q -F \"$PUB_KEY\" ~/.ssh/authorized_keys 2>/dev/null || echo \"$PUB_KEY\" >> ~/.ssh/authorized_keys"

The command adds the public key stored in the shell variable $PUB_KEY to the authorized_keys file of the user on the server central.example.com. A umask ensures the correct file permissions.

To modify, replace user and central.example.com with values relevant to you, and either substitute your public key in place of the $PUB_KEY variable, or define the variable in a bash script or set it as an environment variable prior to executing the command.

Benefits of this approach:

  • unique entries: no duplicate authorized_keys
  • idempotent: subsequent runs given the same input will yield the same result
  • order preserved: entries in authorized_keys retain their order
  • correct permissions: in cases where the .ssh folder and/or authorized_keys file do not already exist, they will be created with the correct permissions for openssh thanks to the umask
  • quiet: the command is quiet
  • automation friendly: fast one-liner that’s easy to add to scripts, with minimized race conditions in situations that involve running automations (e.g. ansible playbooks) in parallel
  • KISS principle: its not as risky or difficult to configure as some other approaches that I have encountered online

Tip: If you want to suppress any motd/welcome banner content that might be outputted when connecting to the remote server via ssh, first touch a .hushlogin file in the target user’s home directory to suppress it.

Ansible implementation

Current: using the authorized_key module

The newer known_hosts module and authorized_key module (featuring numerous feature additions from its introduction through to 2.4+) were introduced to help manage ssh keys on a host.

The authorized_key module has a lot of useful options, including optional exclusivity, supporting sourcing keys from variables (and hence files via a lookup) as well as URL’s, and options to manage the authorized_keys/ folder (e.g. creating it with appropriate permissions if it doesn’t exist).

An example from the docs follows, with one addition: I added the exclusive option in keeping with the theme of this post.

- name: Set authorized key took from file
    user: charlie
    state: present
    key: "{{ lookup('file', '/home/charlie/.ssh/id_rsa.pub') }}"
    exclusive: yes

See the ansible documentation for more examples: http://docs.ansible.com/ansible/latest/authorized_key_module.html

Legacy: using bash in the shell module

One of the more annoying aspects of ansible can be getting escape characters right in templates and certain modules like shell, especially when variables are involved. The following example has valid syntax. You can modify the variables and the become and delegate_to args to suit your scenario:

# assume the ansible user on the control machine can access the remote target server via ssh 

- name: set_fact host_pub_key containing current host's pub key from local playbook_dir/keys
    host_pub_key: "{{ lookup('file', playbook_dir + '/keys/{{ inventory_hostname }}-{{ authorized_user }}-id_rsa.pub') }}"

- name: add current host's pub key to repo server's authorized_keys if its not already present 
  shell: |
    ssh -T {{ example_user }}@{{ example_server }} "umask 0077 ; mkdir -p ~/.ssh ; grep -q -F \"{{ host_pub_key }}\" ~/.ssh/authorized_keys 2>/dev/null || echo \"{{ host_pub_key }}\" >> ~/.ssh/authorized_keys"
    executable: /bin/bash
  become: "{{ ansible_user_id }}"
  delegate_to: localhost

The first task populates the host_pub_key fact from a hypothetical id_rsa.pub key file.

The second task executes the bash snippet that adds the public key to the remote host’s authorized_keys file in a way that avoids duplicates.

Stripping leading and trailing slashes from paths in Ansible

This post covers how to use ansible’s regex_replace filter to strip leading and/or trailing slashes from file paths and URL fragments.

The final example demonstrates how to generate a valid filename from a file or URL path by removing leading and trailing slashes, and replacing any remaining slashes with underscores. This could be useful for a variety of applications from backup scripts to web scraping.

Stripping slashes

Regexes and jinja2 expressions in ansible can be a pain in the ass, especially when it comes to escaping the right thing.

The key to the following examples is a double-escape of the forward slash character. They have been tested on ansible v.

Remove leading slashes

{{ variable_name | regex_replace('^\\/', '') }}

Remove trailing slashes

{{ variable_name | regex_replace('\\/$', '') }}

Stripping both leading and trailing slashes

This example makes use of the | (OR) to combine the previous two examples into one regex:

{{ variable_name | regex_replace('^\\/|\\/$', '') 

Here’s a quick debug task that demonstrates the above in action:

- debug:
    msg: "Look Ma! No leading or trailing slashes! -- {{ item | regex_replace('^\\/|\\/$', '') }}"
    - "/home/and/lots/of/folders/trailing_and_leading/"
    - "no_leading/but/there/is/trailing/"
    - "/leading/but/no_trailing"
    - "no_leading/and/no_trailing"

Creating a valid filename from a path

To create a valid filename from a path, we need to remove leading and trailing slashes, then replace any remaining slashes with underscores. This sort of thing can be useful for naming backup files, data obtained from URL scraping, etc.

This is accomplished by adding a second regex_replace to the previous example that replaces all slashes with underscores, e.g. regex_replace('\\/', '_').

The combined regex that creates a valid filename from given a path:

{{ variable_name | regex_replace('^\\/|\\/$', '') | regex_replace('\\/', '_') }}

If variable_name held a value like “/var/www” or “/var/www/” it would result in: “var_www”.

In case you want to replace the slashes in the path with a character other than an underscore, you can adjust the examples to use any valid filename character instead of a slash (e.g. the ‘^’ character).

Working around a regex_replace bug in the shell module

In my working version of ansible (2.3.x) the regex_replace filter is ignored (!) when it is applied to variables in a tasks using the shell module. The value of unfiltered variables are substituted into the shell command fine, and using any other filter works fine too.

Workaround: employ the set_fact module to build a new fact (variable) based on the original variable, applying the regex_replace filter here as required. Reference the new fact in the shell module to take advantage of the pre-filtered values.

In the following example, assume that the hypothetical {{ list_of_paths }} variable contains a list of strings containing file/dir/URL paths. The set_fact module builds the new {{ paths }} fact such that it contains a “pi” item corresponding to every item in the original list. Each of these items has a “stripped” property containing the filtered value and a “path” property containing the original unfiltered value.

- name: workaround example
    paths: "{{ paths|default([]) + [{'pi': {'path': item, 'stripped': item|regex_replace('^\\/|\\/$', '')|regex_replace('\\/', '_')}}] }}"
      with_items: "{{ list_of_paths }}"

When looping over {{ paths }} in a shell task (e.g. via with_items), the filtered slash-free values for items can be referenced via {{ item.stripped }}. The original unfiltered path can be referenced via {{ item.path }}.