Automating SSH Key Rotation on Hetzner Servers
Security audit said we should rotate SSH keys regularly. I checked when we last rotated keys on our Hetzner servers. Answer: never. Same key, 18 months. Not great.
So I wrote a script. Now we rotate all SSH keys across 8 servers in about 5 minutes instead of SSH-ing into each one by hand.
The manual process
Before I automated it, rotating keys looked like this:
- Generate new SSH keypair locally
- SSH into each server using old key
- Add new public key to ~/.ssh/authorized_keys
- Test that new key works
- Remove old key from authorized_keys
- Update key in password manager
- Repeat for all 8 servers
Takes about 2 hours. And it's easy to mess up. If you remove the old key before checking the new one works, you lock yourself out. I've seen it happen.
The automated script
I created rotate-ssh-keys.sh:
#!/bin/bash
set -e
# List of servers
SERVERS="
server1.example.com
server2.example.com
server3.example.com
"
# Generate new keypair
KEY_NAME="hetzner-$(date +%Y%m%d)"
ssh-keygen -t ed25519 -f "$KEY_NAME" -N "" -C "automated-rotation-$(date +%Y%m%d)"
PUBLIC_KEY=$(cat "${KEY_NAME}.pub")
echo "Generated new key: $KEY_NAME"
echo "Public key: $PUBLIC_KEY"
# Upload new key to Azure Key Vault
az keyvault secret set \
    --vault-name our-vault \
    --name ssh-private-key \
    --file "$KEY_NAME"
az keyvault secret set \
    --vault-name our-vault \
    --name ssh-public-key \
    --value "$PUBLIC_KEY"
# Add new key to all servers
for server in $SERVERS; do
    echo "Adding new key to $server..."
    ssh -i ~/.ssh/old-key root@$server "
        # Backup authorized_keys
        cp ~/.ssh/authorized_keys ~/.ssh/authorized_keys.backup
        # Add new key
        echo '$PUBLIC_KEY' >> ~/.ssh/authorized_keys
        # Remove duplicates
        sort -u ~/.ssh/authorized_keys > ~/.ssh/authorized_keys.tmp
        mv ~/.ssh/authorized_keys.tmp ~/.ssh/authorized_keys
        chmod 600 ~/.ssh/authorized_keys
    "
    # Test new key
    echo "Testing new key on $server..."
    ssh -i "$KEY_NAME" -o BatchMode=yes root@$server "echo 'New key works'" || {
        echo "ERROR: New key doesn't work on $server!"
        exit 1
    }
    echo "✓ New key verified on $server"
done
# Remove old keys from all servers
echo ""
echo "Removing old keys from all servers..."
for server in $SERVERS; do
    echo "Removing old key from $server..."
    ssh -i "$KEY_NAME" root@$server "
        # Keep only the new key
        echo '$PUBLIC_KEY' > ~/.ssh/authorized_keys
        chmod 600 ~/.ssh/authorized_keys
    "
    echo "✓ Old key removed from $server"
done
echo ""
echo "Key rotation complete!"
echo "New private key: $KEY_NAME"
echo "New public key: ${KEY_NAME}.pub"
echo "Keys also stored in Azure Key Vault"
Run it:
./rotate-ssh-keys.sh
Output:
Generated new key: hetzner-20251203
Public key: ssh-ed25519 AAAAC3...
Adding new key to server1.example.com...
Testing new key on server1.example.com...
✓ New key verified on server1.example.com
Adding new key to server2.example.com...
...
Key rotation complete!
Five minutes, done.
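One side effect worth knowing about: the removal loop overwrites authorized_keys with only the new key, so any other keys on the server are dropped as well, and sort -u reorders the file. If other keys need to survive, a minimal sketch of an alternative could look like the following. The function names add_key and remove_key are made up for illustration, and both are meant to run on the server against its own authorized_keys file.

```shell
# Append a public key line only if that exact line is not already present.
add_key() {
    local key_line="$1" file="$2"
    grep -qxF -- "$key_line" "$file" 2>/dev/null || printf '%s\n' "$key_line" >> "$file"
    chmod 600 "$file"
}

# Remove every line that exactly matches one public key, keeping all others
# in their original order.
remove_key() {
    local key_line="$1" file="$2" tmp
    [ -f "$file" ] || return 1
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    # grep exits 1 when no lines are left to print; that is not an error here.
    grep -vxF -- "$key_line" "$file" > "$tmp" || true
    chmod 600 "$tmp"
    mv "$tmp" "$file"
}
```

On a server this would be called as remove_key "$OLD_PUBLIC_KEY" ~/.ssh/authorized_keys, where OLD_PUBLIC_KEY is a hypothetical variable holding the full line of the key being retired.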
Why ed25519 keys
I used ssh-keygen -t ed25519 instead of RSA. A few reasons:
- Shorter public keys (68 base64 characters vs 500+ for a 3072-bit RSA key)
- Faster to generate and verify
- Better security per bit than RSA
- Designed for constant-time implementations, which makes timing attacks harder
RSA still works fine, but for new keys there's no reason not to use ed25519.
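The size difference is easy to check yourself. This is a small sketch, with a hypothetical helper name pubkey_blob_length, that prints the key type and the length of the base64 blob for each line of a .pub file.

```shell
# Print the key type and the length of the base64 blob (second field)
# for every line in a public key file.
pubkey_blob_length() {
    awk '{ print $1, length($2) }' "$1"
}
```

Pointing it at the .pub files of a freshly generated ed25519 key and an RSA key shows the two lengths side by side.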
Storing keys in Azure Key Vault
The script uploads keys to Azure Key Vault. This means they're encrypted at rest, accessible to team members who have the right permissions, backed up automatically, and audited (Azure logs who accessed them and when).
To retrieve a key from Key Vault:
az keyvault secret download \
    --vault-name our-vault \
    --name ssh-private-key \
    --file ~/.ssh/hetzner-key
chmod 600 ~/.ssh/hetzner-key
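With the two commands above, the private key file exists with default permissions for a moment before chmod runs. A minimal sketch of one way to close that gap is to set a restrictive umask for the download. The function name fetch_private_key and the AZ_BIN override are assumptions made for illustration, and the sketch assumes the Azure CLI creates the file with the process umask applied.

```shell
# Download the private key so the file is created with mode 600 from the start.
# AZ_BIN is a hypothetical override that lets you substitute another command for az.
fetch_private_key() {
    local dest="$1"
    (
        # The umask change is confined to this subshell.
        umask 077
        "${AZ_BIN:-az}" keyvault secret download \
            --vault-name our-vault \
            --name ssh-private-key \
            --file "$dest"
    )
}
```

Usage would be fetch_private_key ~/.ssh/hetzner-key.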
Testing before removing old keys
This is the part I cared most about. The script tests that the new key actually works before it removes the old one. Without this, you're one bad key away from a lockout.
ssh -i "$KEY_NAME" -o BatchMode=yes root@$server "echo 'New key works'" || {
    echo "ERROR: New key doesn't work on $server!"
    exit 1
}
-o BatchMode=yes prevents interactive prompts. If the key doesn't work, SSH fails immediately instead of asking for a password.
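If the test fails, the script stops before the removal loop runs, so the old key stays in place on every server. Servers that were already processed just have both keys until the next run. Nothing breaks.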
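One caveat with the test as written: if the old key happens to be loaded in ssh-agent, SSH can authenticate with that key and the check passes even though the new key was never accepted. A stricter variant, as a sketch, adds IdentitiesOnly=yes so only configured identity files are offered. The function name verify_new_key and the SSH_BIN override are assumptions for illustration. Note that IdentityFile entries in ~/.ssh/config that match the host are still offered, so this is tighter but not airtight.

```shell
# Check that a host accepts the given key file, without falling back to
# keys held by ssh-agent. SSH_BIN is a hypothetical override for the ssh command.
verify_new_key() {
    local key_file="$1" host="$2"
    "${SSH_BIN:-ssh}" -i "$key_file" \
        -o BatchMode=yes \
        -o IdentitiesOnly=yes \
        -o ConnectTimeout=10 \
        "root@$host" true
}
```

In the rotation script this would replace the inline test, for example verify_new_key "$KEY_NAME" "$server" followed by the same error handling.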
Avoiding lockouts
If something does go wrong during rotation, recovery options exist:
- Hetzner has KVM console access in their web panel
- You can log in via console and fix authorized_keys manually
- The script also keeps backups: authorized_keys.backup
I've never had to use any of these. But I sleep better knowing they're there.
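For the backup route, this is roughly what the fix from the console would look like, as a minimal sketch. The function name restore_authorized_keys is made up, and it assumes the backup was written by the rotation script's cp step, so it holds the key list from just before the new key was added.

```shell
# Restore authorized_keys from the backup the rotation script left behind.
# Takes the .ssh directory as an optional argument (defaults to ~/.ssh).
restore_authorized_keys() {
    local ssh_dir="${1:-$HOME/.ssh}"
    if [ ! -f "$ssh_dir/authorized_keys.backup" ]; then
        echo "no backup found in $ssh_dir" >&2
        return 1
    fi
    cp "$ssh_dir/authorized_keys.backup" "$ssh_dir/authorized_keys"
    # sshd is strict about permissions on both the directory and the file.
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

Keep in mind that each run of the rotation script overwrites the backup, so it only ever reflects the most recent rotation.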
Key rotation schedule
A common recommendation is to rotate every 90 days. We settled on every 6 months. Not perfect, but realistic.
I added it as a recurring calendar event. The script makes it painless enough that we actually follow through instead of pushing it off.
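Since the key name already carries its creation date, a reminder can also be computed instead of remembered. A minimal sketch, assuming GNU date (the -d flag is not available on macOS/BSD date) and key names of the form hetzner-YYYYMMDD. The function names key_age_days and rotation_due and the 180-day default are assumptions for illustration.

```shell
# Print the age in days of a key named like hetzner-YYYYMMDD.
# The optional second argument overrides "today" (YYYYMMDD).
key_age_days() {
    local key_name="$1" today="${2:-$(date -u +%Y%m%d)}"
    local stamp="${key_name##*-}" then_s now_s
    then_s=$(date -u -d "$stamp" +%s) || return 1
    now_s=$(date -u -d "$today" +%s) || return 1
    echo $(( (now_s - then_s) / 86400 ))
}

# Succeed when the key is at least max_days old (default 180, roughly 6 months).
rotation_due() {
    local key_name="$1" max_days="${2:-180}" age
    age=$(key_age_days "$key_name" "${3:-}") || return 2
    [ "$age" -ge "$max_days" ]
}
```

A cron job or CI check could call rotation_due hetzner-20251203 and send a nudge when it succeeds.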
SSH config for multiple keys
After rotation, I updated ~/.ssh/config:
Host *.example.com
    User root
    IdentityFile ~/.ssh/hetzner-20251203
    IdentitiesOnly yes
IdentitiesOnly yes tells SSH to offer only the key named in IdentityFile instead of every key loaded in ssh-agent. That makes connections faster and avoids confusing "Too many authentication failures" errors.
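Because the key file name changes with every rotation, the IdentityFile line has to change too. A minimal sketch of doing that edit with sed, assuming key files are named hetzner-YYYYMMDD and live in ~/.ssh as in the config above. The function name update_identity_file is made up for illustration.

```shell
# Point IdentityFile lines that reference ~/.ssh/hetzner-<date> at a new key name.
update_identity_file() {
    local config="$1" new_key="$2"
    # -i.bak edits in place and leaves the previous version at "$config.bak".
    sed -i.bak \
        "s|^\([[:space:]]*IdentityFile[[:space:]][[:space:]]*\)~/.ssh/hetzner-[0-9]*[[:space:]]*\$|\1~/.ssh/${new_key}|" \
        "$config"
}
```

After a rotation this would be called as update_identity_file ~/.ssh/config "$KEY_NAME", once the new private key has been moved into ~/.ssh.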
Alternative: Use Hetzner API
Hetzner has an API for managing SSH keys. You could automate rotation through it:
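import os
import requests

# Assumes the API token is exported as HETZNER_API_TOKEN and the new public
# key is in hetzner-20251203.pub (the file the shell script generated)
HETZNER_API_TOKEN = os.environ["HETZNER_API_TOKEN"]
with open("hetzner-20251203.pub") as f:
    PUBLIC_KEY = f.read().strip()

API = "https://api.hetzner.cloud/v1/ssh_keys"
headers = {"Authorization": f"Bearer {HETZNER_API_TOKEN}"}

# Get current keys
response = requests.get(API, headers=headers)
response.raise_for_status()
old_keys = response.json()["ssh_keys"]

# Add new key
new_key = {
    "name": "automated-20251203",
    "public_key": PUBLIC_KEY,
}
response = requests.post(API, headers=headers, json=new_key)
response.raise_for_status()

# Remove old keys
for key in old_keys:
    requests.delete(f"{API}/{key['id']}", headers=headers).raise_for_status()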
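This only works for Hetzner Cloud though, and even there the API manages the keys registered with the project, which get injected when a server is created. It doesn't touch authorized_keys on servers that are already running. We use dedicated servers, which don't support the API for SSH key management. A bit annoying, but that's why I went with the shell script approach.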
Access control
Only the ops team can access SSH private keys in Azure Key Vault. I set up Azure AD group-based access:
az keyvault set-policy \
    --name our-vault \
    --object-id <ops-group-id> \
    --secret-permissions get list
Developers don't need SSH access to the servers. If they need to debug something, they use Kubernetes exec:
kubectl exec -it pod-name -- /bin/bash
This works better than SSH because access is controlled by Kubernetes RBAC, actions show up in audit logs, and we don't need to manage SSH keys for every developer on the team.
Lessons
- Automate key rotation or it won't happen
- Test new keys before removing old ones
- Store keys in one central place (Key Vault), not on individual machines
- Use ed25519 for new keys
- Have a recovery plan (KVM console, backups)
Key rotation went from a painful 2-hour manual process we kept avoiding to a 5-minute script we actually run on schedule. Sometimes the best security improvement is just making the right thing easy to do.