Commit graph

130 commits

Author SHA1 Message Date
raito 2411875825 fix(sshkey): PosixPath does not play well with Buildbot APIs
It expected a `str`, not a `PosixPath`.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-05 21:21:26 +02:00
raito fa83000f07 feat(debug): add manhole debugging
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-05 20:52:52 +02:00
raito 98c5d82bf8 chore(dataclass): use default_factory
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:13:01 -07:00
raito ea5e2c6b98 chore(builders): localize builders specification like Hydra does
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:13:01 -07:00
raito 235ff9b138 chore(entrypoint): hydraJobs → buildbotJobs
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:13:01 -07:00
raito 449837ed81 chore(reporters): make it 3.11+ (and 4.0) compatible!
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:13:01 -07:00
raito b20d0a17ba fix(gerrit): make buildbot able to read the priv ssh key
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:13:01 -07:00
raito bd8c11ed1e chore(origins): expose in a cuter way allowed origins
Worked around in our original deployment, here's a nicer way to set it.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:02:01 -07:00
raito 7102157055 chore(schedule): generalize source
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:02:01 -07:00
raito 2a1ed49ac8 chore(review-callback): generalize the event project name
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:02:01 -07:00
raito c1e7af1794 chore(nix-eval): generalize the builds_scheduler_group by project
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:02:01 -07:00
raito ec9834b0d3 chore(nix): make the target attribute a constant
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:02:01 -07:00
raito c09da505c1 chore(gerrit): put the gerrit configuration in one place and generate repo URLs templates
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:02:01 -07:00
raito 72b6757947 chore(canceller): generalize it to any project
Just iterate over all project names.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:02:00 -07:00
raito d284a8bc77 chore(auth): generalize authentication method to internals of NixOS module
This makes it easier to make it configurable, this is step 1.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 14:01:31 -07:00
raito 16726a55bf chore(*): cleanup unused code
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 12:51:14 -07:00
raito b4ab40f746 chore(gerrit): offer projects configuration and factor out private SSH keys
Previously, we needed to hardcode the URL for private SSH keys,
this is cleaned up and we can iterate over each project for its
configuration.

Configuration is at deployment time.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 12:49:36 -07:00
raito 9eb92e76e7 chore(web): remove outputsPath option
It was relying on GitHub stuff which we don't have and is not an option
we want to support.

If we wanted to do it, we would rather use S3 directly.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 12:48:45 -07:00
raito 4fa460f563 chore(statuses): clarify why we don't use {start, summary}CB
Instead of just commenting them out.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-10-04 12:45:51 -07:00
raito 7875db31eb fix: disable autologin for OAuth 2
Otherwise, read-only access constantly gets redirected to our login
page.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-09-29 15:54:21 +02:00
raito f2d7f25f86 feat: enable Lix admins to admin the Buildbot properly
This removes the need for a proxy and rely on the `groups` property of
the `userDetails` passed at the authentication layer.

To add a certain role, add the group `buildbot-$role` to that user via
Keycloak.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-09-29 00:17:00 +02:00
eldritch horrors 45135d249b fix silent timeout, set build timeout
using `--option` like this hid that the silent timeout was never
actually set, instead we set the unknown and thus ignored option
`--max-silent-time`. while we're at it we can also set a timeout
for the entire build, chosen as two hours because that should be
enough for all current jobs (and hopefully it'll stay that way).
2024-05-26 16:26:25 +02:00
eldritch horrors 2a528f9e53 remove accept-flake-config from n-e-j invocation
it's off by default and thus not representative of user flake setup, we
don't use it anyway, and it's a security risk to boot. there is no good
reason to enable this in any setting that is not perfectly trusted, and
even there it is not such a great idea due to the impurity it requires.
2024-05-26 15:50:55 +02:00
raito e42966e193 Merge pull request 'feat: support Prometheus exports' (#7) from prometheus into main
Reviewed-on: #7
Reviewed-by: jade <jade@noreply.git.lix.systems>
2024-05-11 17:58:16 +00:00
jade d2ad4745c1 Remove --accept-flake-config
This is a cursed option that is free root for anyone who puts hacks into
flake.nix. We don't actually use `nixConfig` in Lix, so we can just
delete this thing.

Fixes: #11
2024-05-06 19:08:23 -07:00
raito 3876a30117 feat: support Prometheus exports
We package a quite old plugin for Buildbot: https://github.com/claws/buildbot-prometheus
Ideally, we should probably vendor it and maintain it ourselves.

There seems to be no protection against the metrics endpoint for
Buildbot, this is not a big deal given that the CI is public.

Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-05-06 14:26:32 +02:00
eldritch horrors 131fc792f7 allow worker counts to be set per arch 2024-04-05 15:13:11 +02:00
eldritch horrors daa84f4169 never build on the coordinator
for such cases just add the coordinator as a remote builder.
2024-04-05 14:12:15 +02:00
eldritch horrors 3717bfab04 automatically cancel outdated builds 2024-03-28 03:52:13 +01:00
puck 2eaee8f62b Fix marking jobs as successful if they never finish evaluating. 2024-03-18 00:07:34 +00:00
eldritch horrors d394f35f55 use one scheduler and worker set per arch
and an additional set for generic tasks like error reporting. this
prevents hol blocking for underutilized arches when at least one arch is
blocking, as usually happens to us with aarch64-linux.
2024-03-15 14:47:49 +01:00
eldritch horrors 5e50a858d7 revert to stable web ui
the react-based ui is too slow for our needs, janky, the log viewer
doesn't work quite right (breaking after ~600 lines of logs viewed),
loses updates to sub-builds, and just blanks its entire screen when a
build finishes. the old ui doesn't do that.
2024-03-15 14:40:23 +01:00
raito 8d36ac1d90 feat: signing key
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-03-12 01:27:46 +01:00
raito 6118daa0a4 feat: binary cache
Signed-off-by: Raito Bezarius <raito@lix.systems>
2024-03-12 01:27:46 +01:00
puck e9b3b38bbf Skip scheduling cached builds; improve reporter message 2024-03-11 15:05:15 +00:00
eldritch horrors 5cdef7efb6 fix status reporting to gerrit
also adjust labels from split verified to single verified, split labels
were only useful during the pre-ci hours
2024-03-11 14:44:09 +01:00
eldritch horrors 51f7b52149 pre-filter drv_info into all_deps
otherwise failure reporting is *enormous* with the entirety of a full
derivation info dump in there
2024-03-11 13:07:35 +01:00
eldritch horrors 13a67b483a fix interrupt()
can't interrupt with things to interrupt. this is technically duplicated
information but keeping parts of the code close to Trigger seems useful.
2024-03-11 13:05:12 +01:00
eldritch horrors 9933971ab0 re-enable the gerrit status reporter 2024-03-11 09:06:29 +01:00
eldritch horrors 29a2ef63e2 show hydra job count in trigger step
previously we immediately triggered all jobs, now we no longer do.
showing the total count at least somewhere is nice to have a rough
indication of how much longer a build may still need to run.
2024-03-11 09:05:28 +01:00
puck 9a15348984 Fix up a few loose ends 2024-03-11 08:08:55 +01:00
puck 4d73275123 Add build result tracking, schedule newly available builds 2024-03-11 08:08:53 +01:00
puck 28ca39af25 WIP: Replace Trigger with custom logic 2024-03-11 08:06:37 +01:00
eldritch horrors e9874c3d98 wip: dependency-tracked build triggering 2024-03-11 07:53:56 +01:00
eldritch horrors f869b52a8d use build-local gc-root directory
without this two builds can interfere with each other if:

  - builds 1 and 2 start
  - build 1 is starved of workers
  - build 2 finishes, removes the shared gcroots directory
  - gc runs
  - build 1 schedules more builds whose .drvs have now been removed

using a dedicated directory for each build fixes this.

we now also need to set alwaysRun on the cleanup command or we risk
littering the system with stale gc roots when a build fails.
2024-03-11 06:48:41 +01:00
eldritch horrors 156e6e3dea remove skipped-builds builder
run all of them on the normal build worker. this significantly
simplifies the overall scheduler/builder config and removes a
triplication of possible builds paths.
2024-03-11 06:27:32 +01:00
eldritch horrors 753df8e340 remove cachix
we aren't using it and it's somewhat in the way of our efforts to
improve scheduling and stuff.
2024-03-11 06:26:39 +01:00
eldritch horrors 0b2545b036 remove unused GitWithRetry 2024-03-11 06:26:39 +01:00
eldritch horrors fdfeef8ad4 remove retry logic
retries don't help us very much, in fact they mostly hurt by repeating
builds that failed for non-transient reasons. retries could help with
workers dropping while running a build, but those rare cases are better
to restart manually than to pend at least twice the ci time for commits
that simply do not build cleanly.
2024-03-11 06:26:38 +01:00
puck ec2ef903ab use .#hydraJobs rather than .#checks 2024-03-08 23:28:49 +00:00