Previously, we needed to hardcode the URL for private SSH keys,
this is cleaned up and we can iterate over each project for its
configuration.
Configuration is at deployment time.
Signed-off-by: Raito Bezarius <raito@lix.systems>
It was relying on GitHub stuff which we don't have and is not an option
we want to support.
If we wanted to do it, we would rather use S3 directly.
Signed-off-by: Raito Bezarius <raito@lix.systems>
This removes the need for a proxy and rely on the `groups` property of
the `userDetails` passed at the authentication layer.
To add a certain role, add the group `buildbot-$role` to that user via
Keycloak.
Signed-off-by: Raito Bezarius <raito@lix.systems>
using `--option` like this hid that the silent timeout was never
actually set, instead we set the unknown and thus ignored option
`--max-silent-time`. while we're at it we can also set a timeout
for the entire build, chosen as two hours because that should be
enough for all current jobs (and hopefully it'll stay that way).
it's off by default and thus not representative of user flake setup, we
don't use it anyway, and it's a security risk to boot. there is no good
reason to enable this in any setting that is not perfectly trusted, and
even there it is not such a great idea due to the impurity it requires.
This is a cursed option that is free root for anyone who puts hacks into
flake.nix. We don't actually use `nixConfig` in Lix, so we can just
delete this thing.
Fixes: #11
We package a quite old plugin for Buildbot: https://github.com/claws/buildbot-prometheus
Ideally, we should probably vendor it and maintain it ourselves.
There seems to be no protection against the metrics endpoint for
Buildbot, this is not a big deal given that the CI is public.
Signed-off-by: Raito Bezarius <raito@lix.systems>
and an additional set for generic tasks like error reporting. this
prevents hol blocking for underutilized arches when at least one arch is
blocking, as usually happens to us with aarch64-linux.
the react-based ui is too slow for our needs, janky, the log viewer
doesn't work quite right (breaking after ~600 lines of logs viewed),
loses updates to sub-builds, and just blanks its entire screen when a
build finishes. the old ui doesn't do that.
previously we immediately triggered all jobs, now we no longer do.
showing the total count at least somewhere is nice to have a rough
indication of how much longer a build may still need to run.
without this two builds can interfere with each other if:
- builds 1 and 2 start
- build 1 is starved of workers
- build 2 finishes, removes the shared gcroots directory
- gc runs
- build 1 schedules more builds whose .drvs have now been removed
using a dedicated directory for each build fixes this.
we now also need to set alwaysRun on the cleanup command or we risk
littering the system with stale gc roots when a build fails.
run all of them on the normal build worker. this significantly
simplifies the overall scheduler/builder config and removes a
triplication of possible builds paths.
retries don't help us very much, in fact they mostly hurt by repeating
builds that failed for non-transient reasons. retries could help with
workers dropping while running a build, but those rare cases are better
to restart manually than to pend at least twice the ci time for commits
that simply do not build cleanly.