Auto-scaling GitLab runners on Hetzner


I’ve been running my own GitLab instance for a while now. The instance itself is fairly large, but maintaining it hasn’t been an issue at all; that’s my day job, after all.

My main issue is the runners attached to the instance. Historically, I would just rent a few cheap VMs at BuyVM or the like to have a few runners always available. The only issue: it ain’t cheap in the long run.

The solution: only spin up servers when we actually need them. In the past, I tried using gitlab_hcloud, which ended up being really janky. Turns out Hetzner themselves have been building a fleeting plugin for the GitLab runner, so that’s what we’re going to take a look at today.

Setting it up is straightforward: just grab your runner token and create a config file:

concurrent = 8

check_interval = 0

[[runners]]
  name = "x86-mini"
  url = "https://g.j4.lc"
  token = "very-secret"
  executor = "docker-autoscaler"

  [runners.docker]
    image = "busybox:latest"
    privileged = true

  [runners.cache]
    Type = "s3"
    Shared = true

    [runners.cache.s3]
      ServerAddress = "hel1.your-objectstorage.com"
      AccessKey = "very-secret"
      SecretKey = "very-secret"
      BucketName = "very-secret"
      Insecure = false

  [runners.autoscaler]
    plugin = "hetznercloud/fleeting-plugin-hetzner:latest"

    update_interval = "1m"
    update_interval_when_expecting = "5s"

    capacity_per_instance = 2
    max_instances = 2
    # 0 = no limit on how many jobs an instance may run before being replaced
    max_use_count = 0

    # cloud-init exits with status 2 when it finished with recoverable errors;
    # treat that as ready too
    instance_ready_command = "cloud-init status --wait || test $? -eq 2"

    [runners.autoscaler.plugin_config]
      name = "hetzner-docker-autoscaler"
      token = "very-secret"

      location = "fsn1"
      server_type = "cpx22"
      image = "debian-12"
      private_networks = []

      user_data = """#cloud-config
      package_update: true
      package_upgrade: true

      apt:
        sources:
          docker.list:
            source: deb [signed-by=$KEY_FILE] https://download.docker.com/linux/debian $RELEASE stable
            keyid: 9DC858229FC7DD38854AE2D88D81803C0EBFCD88

      packages:
        - ca-certificates
        - docker-ce

      swap:
        filename: /var/swap.bin
        size: auto
        maxsize: 4294967296 # 4GB
      """

    [runners.autoscaler.connector_config]
      use_external_addr = true
      use_static_credentials = true
      username = "root"
      key_path = "/ssh/id_ed25519"

    [[runners.autoscaler.policy]]
      periods = ["* * * * *"]
      timezone = "Europe/Berlin"
      # keep no warm instances; delete an instance once it has idled for 50 minutes
      idle_count = 0
      idle_time = "50m"

Then, use the following docker-compose configuration:

services:
  gitlab-runners-fleeting:
    image: gitlab/gitlab-runner:latest
    volumes:
      - ./fleeting/config:/etc/gitlab-runner
      - ./fleeting/base:/plugins
      - ./fleeting/ssh:/ssh
    environment:
      - FLEETING_PLUGIN_PATH=/plugins

Now, we need to do a few things:

  • Generate an SSH key for the runners.
  • Actually download the Hetzner plugin.
  • Put the config.toml in the fleeting/config directory.

This is fairly straightforward:

~$ mkdir -p fleeting/ssh
~$ cd fleeting/ssh
~/fleeting/ssh$ ssh-keygen
Generating public/private ed25519 key pair.
Enter file in which to save the key (/root/.ssh/id_ed25519): /path/to/fleeting/ssh
# rest of generation
~/fleeting/ssh$ cd
~$ docker compose run gitlab-runners-fleeting fleeting install

The fleeting install command will download the Hetzner plugin and put it in the /plugins directory, avoiding having to download it again on each container re-creation.

Now, bringing the container up and triggering a workflow should make a new machine appear in the Hetzner panel and run the job.

After waiting around an hour, you’ll see that the machine was cleaned up automatically. Nice.
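That cleanup is driven by the policy block at the end of the config: with idle_count = 0, every instance is torn down once it has been idle for idle_time. If you would rather keep a warm instance around during work hours, a variant along these lines should work (the schedule below is just an illustration, not something from my setup):

```toml
# Hypothetical variant: keep one instance warm on weekdays, 9:00-17:59 Berlin time
[[runners.autoscaler.policy]]
  periods = ["* 9-17 * * mon-fri"]
  timezone = "Europe/Berlin"
  idle_count = 1
  idle_time = "20m"
```

Since policy entries are evaluated on a cron-style schedule, you can stack several [[runners.autoscaler.policy]] blocks to get different behaviour at different times of day.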


I still encountered a few issues with this setup. My instance needs ARM64 runners for certain workflows, and merging runner configs seems to have some problems.

I tried adding multiple [[runners]] entries to a single config file, each with its own machine definition and runner/Hetzner tokens, but it seems it will still spin up a cpx22 (x86) instead of a cax11 (ARM) when an ARM job is triggered.

Currently, I am running two containers, each with its own config file (one for x86, one for ARM), which isn’t really practical.
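For reference, the two-container workaround looks roughly like this; the service names and directory layout here are my own invention, and each config directory holds a config.toml that differs only in the runner token and the server_type (cpx22 vs. cax11):

```yaml
services:
  gitlab-runners-fleeting-x86:
    image: gitlab/gitlab-runner:latest
    volumes:
      - ./fleeting-x86/config:/etc/gitlab-runner
      - ./fleeting-x86/base:/plugins
      - ./fleeting-x86/ssh:/ssh
    environment:
      - FLEETING_PLUGIN_PATH=/plugins

  gitlab-runners-fleeting-arm:
    image: gitlab/gitlab-runner:latest
    volumes:
      # this config.toml sets server_type = "cax11" for ARM64 jobs
      - ./fleeting-arm/config:/etc/gitlab-runner
      - ./fleeting-arm/base:/plugins
      - ./fleeting-arm/ssh:/ssh
    environment:
      - FLEETING_PLUGIN_PATH=/plugins
```

Each container needs its own fleeting install run, since the plugin directories are separate.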

If anybody knows how to avoid that issue, please tell me how in the comments.

