Cloning A Private GitHub Repo Via SSH Onto A Server That Is Being Provisioned By Ansible

June 30, 2022 Updated: November 30, 2023 8 minute read

Background/Problem

Note: I usually try to keep my blog posts short, but this one turned out a little longer than usual. Sorry for that.

Recently I started to use Ansible [1]. One of the first things I worked on was a playbook to set up my own laptops/computers. I just like having them all configured the same way. Anyway, pretty quickly I stumbled upon a task that took me quite a long time to get working. Here is what I wanted to do:

Let’s assume that I have two laptops, and let’s call them Tom and Jerry so that I have names to refer to them by. Jerry is my current working machine and is properly set up. Tom is a new laptop that I want to set up/configure using Ansible.

I keep a lot of my dotfiles in a public repository on GitHub. It was pretty easy to clone that one to Tom with a task in an Ansible playbook. But I also have some private repos with configuration files and other stuff that I needed on Tom. Cloning them in a safe way turned out to be a little complicated.

Since I didn’t find any blog post, any answer on Stack Overflow or similar sites, or any other single source that explained everything I needed, I thought it would be a good idea to summarize the solution I came up with. Mainly for my future self, but also for everyone who has the same problem.

A little more about my requirements:

1. Ansible should connect from Jerry to Tom via SSH using public key authentication (no password). Nothing special here.
2. I don’t want to set the accept_hostkey option in the corresponding Ansible task (git module) to true. This would just circumvent one of the security features of SSH. Even the documentation of Ansible’s built-in git module states that.
3. Tom has to use SSH and public key authentication to clone repos from GitHub. Why? Well, because we are talking about private repositories.
4. I don’t want to copy the SSH key that I use on Jerry to access GitHub to Tom. This might not be a security issue as long as Tom and Jerry are both my local laptops. But later on, I might want to use the same playbook to provision servers in the cloud, and I just don’t want my GitHub key floating around the web.

Just in case you missed it: There are two different SSH keys involved in this:

1. One SSH key is needed to connect from Jerry to Tom. The public part of that key has to be on Tom; the private part of the key naturally stays on Jerry.
2. The second SSH key is needed to clone Git repos that are on GitHub to Tom. The public part of that key has to be on GitHub; the private part would normally be on the computer that clones a repo, but in this case it should stay on Jerry and not be copied to Tom.

Solution

Concerning requirement one (“Ansible connects from Jerry to Tom via SSH using public key authentication (no password)”):

Somehow Tom has to accept the SSH key of the user that connects from Jerry to Tom. Therefore the public key of that user has to be listed in the authorized_keys file on Tom. This file is usually found in the home directory of the user account you want to use while connecting to a machine (/home/<USER>/.ssh/authorized_keys).

As long as you have physical access to Tom, it’s fairly easy to achieve that. Just copy the public key onto a USB thumb drive, mount that drive on Tom, and copy the key. If you don’t have physical access, I actually don’t know how to move the key from Jerry to Tom without connecting to Tom via SSH with password authentication at least once. As long as SSH on Tom is configured to accept password authentication, you can, for example, use scp to copy the key from Jerry to Tom. Afterwards, connect to Tom (still using password authentication) and configure SSH to no longer accept password authentication by putting the following line into the sshd_config file (usually found under /etc/ssh/):

PasswordAuthentication no

Don’t forget to restart the SSH service afterwards.
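
The change to sshd_config described above can be sketched in shell. This is only a sketch under assumptions: the function name disable_password_auth is made up for illustration, the key path and host in the usage comments are placeholders, and the name of the SSH service (sshd or ssh) depends on the distribution. ssh-copy-id is mentioned as an alternative to scp because it appends the key to authorized_keys for you.

```shell
# Hypothetical helper (not part of OpenSSH): make sure the given
# sshd_config-style file contains "PasswordAuthentication no".
disable_password_auth() {
    local config_file="$1"
    local pattern='^[[:space:]]*#?[[:space:]]*PasswordAuthentication[[:space:]].*$'
    local tmp
    tmp="$(mktemp)" || return 1

    if grep -Eq "$pattern" "$config_file"; then
        # Rewrite every existing (possibly commented-out) directive.
        sed -E "s/$pattern/PasswordAuthentication no/" "$config_file" > "$tmp"
    else
        # No directive yet: keep the file as it is and append one.
        { cat "$config_file"; printf '\nPasswordAuthentication no\n'; } > "$tmp"
    fi

    # Overwrite in place so that owner and permissions of the file are kept.
    cat "$tmp" > "$config_file"
    rm -f "$tmp"
}

# Usage on Jerry (placeholder key path and host):
#   ssh-copy-id -i ~/.ssh/id_ed25519.pub USER@IP_OF_TOM
# Usage on Tom, as root:
#   disable_password_auth /etc/ssh/sshd_config
#   sshd -t && systemctl restart sshd   # the unit may be called "ssh" instead
```

Running sshd -t before the restart checks the configuration for syntax errors, which helps to avoid locking yourself out. Some distributions also read additional files from /etc/ssh/sshd_config.d/, so it may be worth checking that nothing there re-enables password authentication.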

I really don’t know how to avoid having a short period of time where a newly created machine somewhere in the cloud accepts password authentication. If you do, please let me know.

Requirement number two (“I don’t want to set the accept_hostkey option in the corresponding Ansible task to true.”): So, what’s the problem here?

Well, the very first time you connect to any new host via SSH, you will be informed (at least if you use OpenSSH) that you are trying to connect to a new host, and you will be asked whether you trust that the other side is who it claims to be. For example, assume you have never connected to GitHub before and type the following command:

ssh -T git@github.com

You will get this output (or similar):

The authenticity of host 'github.com (140.82.121.3)' can't be established.
ED25519 key fingerprint is SHA256:+DiY3wvvV6TuJJhbpZisF/zLDA0zPMSvHdkr4UvCOqU.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])?

You should not just type “yes” and hit enter at this point, but actually verify that the given fingerprint is correct. More about why you should do that, and how you could go about doing it, can be found here [2].
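
One way to do the comparison without relying on eyesight is to fetch the host key with ssh-keyscan, let ssh-keygen compute its fingerprint, and compare that with a value from a source you trust. The following is a minimal sketch, not a definitive procedure: the helper name fingerprints_match and the file name github_hostkey are made up for illustration, and the expected value in the usage comment is simply the one from the sample output above, which you should confirm against the fingerprints GitHub publishes in its documentation.

```shell
# Hypothetical helper: reads `ssh-keygen -l` output on stdin and succeeds only
# if there is at least one line and every line carries the expected fingerprint.
# Each line looks like: "256 SHA256:<fingerprint> <host> (ED25519)".
fingerprints_match() {
    local expected="$1"
    awk -v want="$expected" '
        $2 != want { bad = 1 }
        END { exit ((NR == 0 || bad) ? 1 : 0) }
    '
}

# Usage sketch (needs network access):
#   ssh-keyscan -t ed25519 github.com > github_hostkey
#   ssh-keygen -lf github_hostkey \
#       | fingerprints_match 'SHA256:+DiY3wvvV6TuJJhbpZisF/zLDA0zPMSvHdkr4UvCOqU' \
#       && echo 'fingerprint matches'
```

The important part is where the expected value comes from: it has to reach you over a channel other than the SSH connection you are trying to verify.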

If you set the accept_hostkey option to true in an Ansible task using the git module, it would just happily accept any fingerprint it is presented with, no matter whether the fingerprint came from GitHub or from some malicious actor.

After you have accepted a fingerprint, and therefore the SSH key of the host you want to connect to, the public key of that host will be stored in the known_hosts file that is usually in your home directory under /home/<USER>/.ssh/. In all further connection attempts to that same host, the public key of that host will be checked against the saved key in the known_hosts file.

So, back to my situation: GitHub’s public key has to be put into the known_hosts file in my home directory on Tom. Well, that’s actually not such a critical task. Just ssh into Tom and copy the known_hosts file from Jerry to Tom (or append everything needed to the known_hosts file that might already exist on Tom). And since the known_hosts file does not contain any secrets, I’m thinking about putting such a file, with the public keys of public services like GitHub and GitLab, into a Git repository that I can just clone onto new machines that I set up. If you find any public repository with public keys, you should not blindly use it; you should check yourself whether the keys are correct. But setting up something like that for myself should not introduce any risk.
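
Appending to a known_hosts file that already exists on Tom can be done without creating duplicates. The sketch below assumes a helper called merge_known_hosts (a made-up name) and a copy of Jerry’s file stored on Tom under the placeholder name known_hosts.jerry. It compares lines literally, so two differently hashed entries for the same host would both be kept.

```shell
# Hypothetical helper: append to the file $2 every line of the file $1
# that $2 does not contain yet.
merge_known_hosts() {
    local src="$1"
    local dest="$2"
    local new_lines

    # Make sure the destination exists so grep can read it as a pattern file.
    touch "$dest"

    # -F fixed strings, -x whole-line match, -v invert: keep only new lines.
    # grep exits non-zero when it finds nothing new; then there is nothing to do.
    new_lines="$(grep -Fxv -f "$dest" "$src")" || return 0
    printf '%s\n' "$new_lines" >> "$dest"
}

# Usage sketch (placeholder user, host and file names):
#   on Jerry: scp ~/.ssh/known_hosts USER@IP_OF_TOM:known_hosts.jerry
#   on Tom:   merge_known_hosts ~/known_hosts.jerry ~/.ssh/known_hosts
```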

So far everything was pretty “standard”. Nothing I hadn’t done before I started to work with Ansible. But the next two requirements are the ones that took me some time to figure out. When running an Ansible playbook from Jerry against the host Tom, Tom must use public key authentication when cloning a GitHub repository, and I don’t want to copy the key that I use on Jerry to clone private GitHub repositories to Tom.

Well, of course I could just ssh into Tom, generate a new SSH key pair, and add the newly created public key to my GitHub profile. But this involves a lot of manual steps, and I surely don’t want to do that for the potentially many machines that I might set up in the future. That means Tom has to be able to use my key from Jerry without the key being copied to Tom. The solution to this problem is the ssh-agent. In order to make this work, the following has to be done:

Install ssh-agent on both machines (Tom and Jerry).
Add the following two lines to the /home/<USER>/.ssh/config file on Jerry:

Host IP_OF_TOM
    ForwardAgent yes

Add the following line to the file /etc/ssh/sshd_config on Tom and make sure the ssh service is restarted afterwards:

AllowAgentForwarding yes

To make sure the ssh-agent is started when logging in to a machine, add the following lines to the .bash_profile file (or an equivalent file, if you don’t use .bash_profile) on both machines (Tom and Jerry):

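# Start an agent for this user only if none is running; added keys expire after 1 hour.
if ! pgrep -u "$USER" ssh-agent > /dev/null; then
    ssh-agent -t 1h > "$XDG_RUNTIME_DIR/ssh-agent.env"
fi
# Load the agent's environment unless SSH_AUTH_SOCK is already set
# (for example by a forwarded agent).
if [[ -z "$SSH_AUTH_SOCK" ]]; then
    source "$XDG_RUNTIME_DIR/ssh-agent.env" >/dev/null
fi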

It might happen that you have to configure Ansible to use the forwarding. To do so, I usually add the following lines to a file called ansible.cfg in the project directory that I currently work on:

[ssh_connection]
ssh_args = -o ForwardAgent=yes

Now, before you run the Ansible playbook on Jerry against Tom, add the SSH key that you want to use to clone GitHub repos to the ssh-agent on Jerry:

ssh-add /path/to/ssh-key

And finally, if you run the playbook, the ssh-agent should make sure that Tom is able to use your key from Jerry to clone a private repository.
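
If the clone still fails, it can help to check each link of the chain separately before suspecting the playbook. The following sketch assumes a made-up helper name, agent_socket_available, and reuses IP_OF_TOM from the SSH config above as a placeholder. The helper only checks that SSH_AUTH_SOCK points at an existing socket; ssh-add -l then shows which keys the agent actually holds.

```shell
# Hypothetical helper: succeeds if SSH_AUTH_SOCK is set and points at an
# existing socket, which is what a local or forwarded ssh-agent provides.
agent_socket_available() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

# On Jerry, before running the playbook:
#   agent_socket_available && ssh-add -l     # lists the keys the agent holds
# Through the forwarded connection:
#   ssh IP_OF_TOM 'ssh-add -l'               # should list the same key
#   ssh IP_OF_TOM 'ssh -T git@github.com'    # should greet you with your GitHub user name
```

If the key shows up on Jerry but not through the connection to Tom, the forwarding settings are the place to look; if it shows up on Tom but GitHub rejects it, the public key is probably missing from the GitHub account.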

Changelog

2022-09-25:

Changed wording a little bit.
Changed formatting slightly.

2023-11-30:

Add paragraph about the lines in ansible.cfg. This was missing and cost me a lot of time hunting down a bug in a playbook. Sorry if that has caused you some headaches as well.
Correct some minor errors (grammar).

Take care,
Andreas

References

[1] Red Hat, Inc., “Ansible is Simple IT Automation.” [Online]. Available at: https://www.ansible.com/. [Accessed: 30-Jun-2022].
[2] The Qubes OS Project and others, “Verifying signatures | Qubes OS.” [Online]. Available at: https://www.qubes-os.org/security/verifying-signatures/. [Accessed: 06-Aug-2022].

Andreas Schuster