4230 commits

Author SHA1 Message Date
eccf01d4fe Merge pull request 'flake: update nixpkgs to 24.11' (#15) from ma27/hydra:update-nixpkgs into main
Reviewed-on: lix-project/hydra#15
2024-12-06 16:37:25 +00:00
42c4a85ec8 Merge pull request 'README: document release branches' (#16) from ma27/hydra:readme-branches into main
Reviewed-on: lix-project/hydra#16
2024-12-06 16:34:52 +00:00
2b1c46730c
README: document release branches
Closes #12
2024-12-04 10:24:37 +01:00
51608e1ca1
flake: update nixpkgs to 24.11
I've been using the module on 24.11 and building Hydra from this repo
with 24.11 as well, so far it's looking good.

Making the upgrade since 24.05 is deprecated now.
2024-12-04 10:13:34 +01:00
6285440304
Merge pull request 'chore: apply lix include-rearrangement to hydra' (#14) from jade/include-rearrangement into main
Reviewed-on: lix-project/hydra#14
Reviewed-by: leo60228 <leo@60228.dev>
2024-11-29 16:23:14 -05:00
988554eb7a chore: apply lix include-rearrangement to hydra
This complies with the new include layout in Lix, which will eventually
replace the legacy one previously in use.
2024-11-25 23:08:32 -08:00
4acb1959a1 fixup: use the correct nix-eval-jobs input URL; same commit 2024-11-25 23:02:52 -08:00
453adb7f25 Merge pull request 'Update all flake inputs, fix build with latest Lix' (#13) from ma27/hydra:update-lix into main
Reviewed-on: lix-project/hydra#13
2024-11-26 06:50:41 +00:00
acd54bfbd6
Update all flake inputs, fix build with latest Lix
Flake lock file updates:

• Updated input 'lix':
    'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=ed9b7f4f84fd60ad8618645cc1bae2d686ff0db6' (2024-10-05)
  → 'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=66f6dbda32959dd5cf3a9aaba15af72d037ab7ff' (2024-11-20)
• Updated input 'lix/nix2container':
    'github:nlewo/nix2container/3853e5caf9ad24103b13aa6e0e8bcebb47649fe4' (2024-07-10)
  → 'github:nlewo/nix2container/fa6bb0a1159f55d071ba99331355955ae30b3401' (2024-08-30)
• Updated input 'lix/pre-commit-hooks':
    'github:cachix/git-hooks.nix/f451c19376071a90d8c58ab1a953c6e9840527fd' (2024-07-15)
  → 'github:cachix/git-hooks.nix/4e743a6920eab45e8ba0fbe49dc459f1423a4b74' (2024-09-19)
• Updated input 'nix-eval-jobs':
    'git+https://git.lix.systems/lix-project/nix-eval-jobs?ref=refs/heads/main&rev=42a160bce2fd9ffebc3809746bc80cc7208f9b08' (2024-08-13)
  → 'git+https://git.lix.systems/lix-project/nix-eval-jobs?ref=refs/heads/main&rev=912a9d63319e71ca131e16eea3348145a255db2e' (2024-11-18)
• Updated input 'nix-eval-jobs/flake-parts':
    'github:hercules-ci/flake-parts/8471fe90ad337a8074e957b69ca4d0089218391d' (2024-08-01)
  → 'github:hercules-ci/flake-parts/506278e768c2a08bec68eb62932193e341f55c90' (2024-11-01)
• Updated input 'nix-eval-jobs/nix-github-actions':
    'github:nix-community/nix-github-actions/622f829f5fe69310a866c8a6cd07e747c44ef820' (2024-07-04)
  → 'github:nix-community/nix-github-actions/e04df33f62cdcf93d73e9a04142464753a16db67' (2024-10-24)
• Updated input 'nix-eval-jobs/treefmt-nix':
    'github:numtide/treefmt-nix/349de7bc435bdff37785c2466f054ed1766173be' (2024-08-12)
  → 'github:numtide/treefmt-nix/746901bb8dba96d154b66492a29f5db0693dbfcc' (2024-10-30)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/ecbc1ca8ffd6aea8372ad16be9ebbb39889e55b6' (2024-10-06)
  → 'github:NixOS/nixpkgs/e8c38b73aeb218e27163376a2d617e61a2ad9b59' (2024-11-16)
2024-11-23 10:58:14 +01:00
a4b2b58e2b
update for lix header change 2024-11-17 19:57:36 -05:00
ee1234c15c ignoreException has been split into two
The Finally part is a destructor, so using `ignoreExceptionInDestructor`
seems to be the correct choice here.
2024-10-07 19:22:32 +02:00
7c7078cccf Fix build with latest Lix
Since ca1dc3f70bf98e2424b7b2666ee2180675b67451, the NAR parser has moved
the preallocate & receive steps into the file handle class to remove the
assumption that only one file can be handled at a time.
2024-10-07 19:22:32 +02:00
a5099d9e80 flake.lock: Update
Flake lock file updates:

• Updated input 'lix':
    'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=02eb07cfd539c34c080cb1baf042e5e780c1fcc2' (2024-09-01)
  → 'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=31954b51367737defbae87fba053b068897416fb' (2024-09-26)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/6e99f2a27d600612004fbd2c3282d614bfee6421' (2024-08-30)
  → 'github:NixOS/nixpkgs/759537f06e6999e141588ff1c9be7f3a5c060106' (2024-09-25)
2024-10-07 19:22:32 +02:00
Yureka
799441dcf6 Revert "update lix"
This reverts commit e4d466ffcd.
2024-10-06 13:55:10 +02:00
Yureka
e4d466ffcd update lix 2024-10-05 23:32:45 +02:00
d3257e4761 Merge pull request 'Add metric for builds waiting for download slot' (#9) from waiting-metrics into main
Reviewed-on: lix-project/hydra#9
Reviewed-by: raito <raito@noreply.git.lix.systems>
2024-10-01 16:16:24 +00:00
f23ec71227 Add metric for builds waiting for download slot 2024-10-01 19:14:24 +03:00
ac37e44982 Merge pull request 'Update Lix; fix build' (#7) from ma27/hydra:lix-update into main
Reviewed-on: lix-project/hydra#7
Reviewed-by: leo60228 <leo@60228.dev>
2024-09-02 22:56:03 +00:00
6a88e647e7
flake.lock: Update; fix build
Flake lock file updates:

• Updated input 'lix':
    'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=278fddc317cf0cf4d3602d0ec0f24d1dd281fadb' (2024-08-17)
  → 'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=02eb07cfd539c34c080cb1baf042e5e780c1fcc2' (2024-09-01)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/c3d4ac725177c030b1e289015989da2ad9d56af0' (2024-08-15)
  → 'github:NixOS/nixpkgs/6e99f2a27d600612004fbd2c3282d614bfee6421' (2024-08-30)
2024-09-02 10:53:46 +02:00
8d5d4942e1
queue-runner: remove unused method from State 2024-08-27 02:57:37 +02:00
e5a8ee5c17
web: require permissions for /api/push 2024-08-27 02:57:16 +02:00
fd7fd0ad65
treewide: clang-tidy modernize 2024-08-27 01:33:12 +02:00
d3fcedbcf5
treewide: enable clang-tidy bugprone findings
Fix some trivial findings throughout the codebase, mostly making
implicit casts explicit.
2024-08-27 00:43:17 +02:00
3891ad77e3
queue-runner: change Machine object creation to work around clang bug
https://github.com/llvm/llvm-project/issues/106123
2024-08-26 22:34:48 +02:00
21fd1f8993
flake: add devShells, including a clang one for clang-tidy & more 2024-08-26 22:22:18 +02:00
ab6d81fad4
api: fix github webhook 2024-08-26 20:26:21 +02:00
Sandro
64df0cba47
Match URIs that don't end in .git
Co-authored-by: Charlotte <lotte@chir.rs>
2024-08-26 20:26:21 +02:00
Sandro Jäckel
6179b298cb
Add gitea push hook 2024-08-26 20:26:20 +02:00
44b9a7b95d
queue-runner: handle broken pg pool connections in builder code
Completes 9b62c52e5c with another location
that was initially missed.
2024-08-25 22:05:13 +02:00
3ee51dbe58 readIntoSocket: fix with store URIs containing an &
The third argument to `open()` in `-|` mode is passed to a shell if it's
a string. In my case the store URI contains
`?secret-key=${signingKey.directory}/secret&compression=zstd`

For the `nix store cat` case this means that

* until `&` the process will be started in the background. This fails
  immediately because no path to cat is specified.
* `compression=zstd` is a variable assignment
* the `$path` argument to `store cat` is attempted to be executed as
  another command

Passing just the list solves the problem.
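
The same pitfall exists outside Perl. As a rough Python analog (not the Hydra code itself), the sketch below shows a shell-style tokenizer splitting a command string at `&`, while a list argv reaches the child process unchanged. The store URI, the store path, and the helper names `shell_tokens` and `echo_argv` are made up for illustration.

```python
import shlex
import subprocess
import sys
from typing import List

# Hypothetical store URI, modelled on the one described in the commit message.
STORE_URI = "file:///tmp/cache?secret-key=/run/keys/secret&compression=zstd"


def shell_tokens(command: str) -> List[str]:
    """Split a command line roughly as a POSIX shell would.

    Shell operators such as & come out as separate tokens.
    """
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


def echo_argv(args: List[str]) -> List[str]:
    """Run a child process with a list argv (no shell) and return what it received."""
    child = [sys.executable, "-c", "import sys; print('\\n'.join(sys.argv[1:]))"]
    result = subprocess.run(child + args, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

Calling `shell_tokens` on a single command string that embeds `STORE_URI` yields `&` as its own token, which is where a shell would cut the command in two. Passing the same URI as one list element to `echo_argv` returns it intact.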
2024-08-18 21:41:54 +00:00
e987f74954
doc: drop dev-notes & make update-dbix more discoverable
`dev-notes` are severely outdated. I dropped everything except one note
that I moved to hacking.md. The parts about creating users are also
covered elsewhere.

The `update-dbix` part got a just command to make it discoverable again.
2024-08-18 14:47:09 +02:00
1f802c008c
flake.lock: Update
Flake lock file updates:

• Updated input 'lix':
    'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=5137cea99044d54337e439510a647743110b2d7d' (2024-08-10)
  → 'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=278fddc317cf0cf4d3602d0ec0f24d1dd281fadb' (2024-08-17)
• Updated input 'nix-eval-jobs':
    'git+https://git.lix.systems/lix-project/nix-eval-jobs?ref=refs/heads/main&rev=c057494450f2d1420726ddb0bab145a5ff4ddfdd' (2024-07-17)
  → 'git+https://git.lix.systems/lix-project/nix-eval-jobs?ref=refs/heads/main&rev=42a160bce2fd9ffebc3809746bc80cc7208f9b08' (2024-08-13)
• Updated input 'nix-eval-jobs/flake-parts':
    'github:hercules-ci/flake-parts/9227223f6d922fee3c7b190b2cc238a99527bbb7' (2024-07-03)
  → 'github:hercules-ci/flake-parts/8471fe90ad337a8074e957b69ca4d0089218391d' (2024-08-01)
• Updated input 'nix-eval-jobs/treefmt-nix':
    'github:numtide/treefmt-nix/0fb28f237f83295b4dd05e342f333b447c097398' (2024-07-15)
  → 'github:numtide/treefmt-nix/349de7bc435bdff37785c2466f054ed1766173be' (2024-08-12)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/a781ff33ae258bbcfd4ed6e673860c3e923bf2cc' (2024-08-10)
  → 'github:NixOS/nixpkgs/c3d4ac725177c030b1e289015989da2ad9d56af0' (2024-08-15)
2024-08-18 14:18:36 +02:00
3a4e0d4917
Get dev environment working again
* justfile, inspired from Lix.
* let foreman use the stuff from outputs, similar to what Lix does.
* mess around with PERL5LIB[1] and PATH to get tests running locally.

[1] I don't really know how `Setup` was found before tbh.
2024-08-18 14:18:36 +02:00
3517acc5ba
Add direnv & PLS to the dev setup 2024-08-18 14:18:35 +02:00
459aa0a598 Stream files from store instead of buffering them
When an artifact is requested from hydra the output is first copied
from the nix store into memory and then sent as a response, delaying
the download and taking up significant amounts of memory.

As reported in https://github.com/NixOS/hydra/issues/1357

Instead of calling a command and blocking while reading in the entire
output, this adds read_into_socket(). The function takes a command,
starts a subprocess with that command, and returns a file descriptor
attached to its stdout.
This file descriptor is then used by Catalyst's response builder to
stream the output directly.
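
The idea of streaming a subprocess's output instead of buffering it can be sketched in Python as follows. This is only an illustration of the technique, not the Perl/Catalyst code from this commit. `stream_command` and `CHUNK_SIZE` are hypothetical names, and the chunk size is an arbitrary choice.

```python
import subprocess
import sys
from typing import Iterator, List

CHUNK_SIZE = 64 * 1024  # bytes per read; an arbitrary illustrative value


def stream_command(argv: List[str], chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the command's stdout chunk by chunk instead of buffering it all."""
    # A list argv means no shell is involved in starting the child.
    with subprocess.Popen(argv, stdout=subprocess.PIPE) as proc:
        assert proc.stdout is not None
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:  # empty read means the child closed its stdout
                break
            yield chunk
    # Leaving the with-block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, argv)


if __name__ == "__main__":
    # Example: forward a child's output in 4 KiB pieces as it arrives.
    for piece in stream_command([sys.executable, "-c", "print('hello')"], 4096):
        sys.stdout.buffer.write(piece)
```

Because the generator holds at most one chunk at a time, memory use stays bounded by the chunk size regardless of how large the output is.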
2024-08-13 22:09:48 +02:00
f1b552ecbf update flake locks, fix compile errors 2024-08-12 22:45:34 +02:00
db8c2cc4a8
.github: remove
The primary host for this repo isn't github, and this is causing a whole
bunch of false indications in the Lix Forgejo UI as it tries to run the
.github/workflow with non-existing runners.
2024-08-11 16:15:12 +02:00
Rick van Schijndel
8858abb1a6
t/test.pl: increase event-timeout, set qvf
Only log issues/failures when something's actually up.
It has irked me for a long time that so much output came
out of running the tests, this seems to silence it.
It does hide some warnings, but I think it makes the output
so much more readable that it's worth the tradeoff.

Helps for highly parallel running of jobs, sometimes they'd not give output for a while.
Setting this timeout higher appears to help.
Not completely sure if this is the right place to do it, but it works fine for me.
2024-08-11 16:08:35 +02:00
Rick van Schijndel
ef619eca99
t: increase timeouts for slow commands with high load
We've seen many failures on ofborg; a lot of them ultimately appear to come down to
a timeout being hit, resulting in something like this:

Failure executing slapadd -F /<path>/slap.d -b dc=example -l /<path>/load.ldif.

Hopefully this resolves it for most cases.
I've done some endurance testing and this helps a lot.
Some other commands also regularly time out under high load:

- hydra-init
- hydra-create-user
- nix-store --delete

This should address most issues with tests randomly failing.

Used the following script for endurance testing:

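```
import os
import subprocess

run_counter = 0
fail_counter = 0

# Copy the environment once so this process's own os.environ is not mutated.
env = os.environ.copy()
env["YATH_JOB_COUNT"] = "20"  # tune loosely to your core count

while True:
    try:
        run_counter += 1
        print(f"Starting run {run_counter}")
        # Run the whole test suite; a non-zero exit code counts as a failed run.
        result = subprocess.run(["perl", "t/test.pl"], env=env)
        if result.returncode != 0:
            fail_counter += 1
        print(f"Finish run {run_counter}, total fail count: {fail_counter}")
    except KeyboardInterrupt:
        # Ctrl-C ends the endurance loop and prints a summary.
        print(f"Finished {run_counter} runs with {fail_counter} fails")
        break
```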

In case someone else wants to do it on their system :).
Note that YATH_JOB_COUNT may need to be changed loosely based on your
cores.
I only have 4 cores (8 threads), so for others higher numbers might
yield better results in hashing out unstable tests.
2024-08-11 16:08:09 +02:00
marius david
41dfa0e443
Document the default user and port in hacking.md 2024-08-11 16:06:08 +02:00
4b107e6ff3
hydra-eval-jobset: pass --workers and --max-memory-size to n-e-j
Lost in the h-e-j -> n-e-j migration, causing evaluation to always be
single threaded and limited to 4GiB RAM. Follow the config settings like
h-e-j used to do (via C++ code).
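
As a rough illustration of forwarding config settings as command-line flags, here is a minimal Python sketch. It is not the Hydra implementation. The two flag names come from the commit title, while the config keys `evaluator_workers` and `evaluator_max_memory_size` and the function name are assumptions made for this example.

```python
from typing import List, Mapping


def nix_eval_jobs_args(config: Mapping[str, str]) -> List[str]:
    """Build extra nix-eval-jobs arguments from config values.

    The config key names are assumed for illustration; unset keys add no flag.
    """
    args: List[str] = []
    workers = config.get("evaluator_workers")
    if workers is not None:
        args += ["--workers", str(int(workers))]
    max_memory = config.get("evaluator_max_memory_size")
    if max_memory is not None:
        args += ["--max-memory-size", str(int(max_memory))]
    return args
```

Leaving a key unset simply omits the flag, so the evaluator's own defaults apply in that case.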
2024-07-22 23:16:29 +02:00
4b886d9c45
autotools -> meson
There are some known regressions regarding local testing setups - since
everything was kinda half written with the expectation that build dir =
source dir (which should not be true anymore). But everything builds and
the test suite runs fine, after several hours spent debugging random
crashes in libpqxx with MALLOC_PERTURB_...
2024-07-22 22:30:41 +02:00
fbb894af4e
static: de-bundle vendored dependencies
The current way this whole build works is incompatible with having a
separate build dir. To be
improved in the future - maybe minimize the dependencies a bit. But this
isn't so much data that we really have to care.
2024-07-22 16:30:13 +02:00
Niklas Hambüchen
8a984efaef
renderInputDiff: Increase git hash length 8 -> 12
See investigation on lengths required to be conflict-free in practice:

https://github.com/NixOS/hydra/pull/1258#issuecomment-1321891677
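
For intuition about why a longer prefix helps, a birthday-bound estimate can be sketched as below. This is not taken from the linked investigation. It assumes abbreviated hashes behave like uniformly random hex strings, and `collision_probability` is a hypothetical helper.

```python
import math


def collision_probability(num_commits: int, hex_digits: int) -> float:
    """Birthday-bound estimate that at least two abbreviated hashes collide.

    Assumes hashes are uniformly random over 16**hex_digits possible values.
    """
    space = 16 ** hex_digits
    pairs = num_commits * (num_commits - 1) / 2
    # 1 - exp(-pairs / space), computed stably for very small probabilities.
    return -math.expm1(-pairs / space)
```

Going from 8 to 12 hex digits multiplies the hash space by 16**4 = 65536, so for a fixed number of commits the estimated collision probability drops sharply.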
2024-07-21 12:23:29 +02:00
abc9f11417
queue runner: fix store URI args being written to the SSH hosts file 2024-07-20 16:09:07 +02:00
9a4a5dd624
jobset-eval: fix actions not showing up sometimes for new jobs
New jobs have their "new" status take precedence over them being
"failed" or "queued", which means actions that can act on "failed" or
"queued" jobs weren't shown to the user when they could only act on
"new" jobs.
2024-07-20 13:09:39 +02:00
ac406a9175
nixos-modules: hydra-queue-runner fix network-online.target eval warning 2024-07-19 09:13:32 +02:00
73616aa0d9
nixos-module: don't force Nix GC to keep outputs
This isn't actually needed (h.n.o even overrides it!).

Fix the use of deprecated `gc-keep-derivations` alias along the way.
2024-07-17 13:21:58 +02:00
d33fc08341
nixos-module: fix trusted users
- Use extra-trusted-users to avoid overriding the default set of trusted
  users and causing permission issues.
- Add hydra and hydra-www users which also need permissions.
2024-07-17 13:20:37 +02:00
b0e9b4b2f9
hydra-eval-jobset: incrementally ingest eval results
nix-eval-jobs streams output, unlike hydra-eval-jobs. Now that we've
migrated, we can use this to:

1. Use less RAM by avoiding buffering a whole eval's worth of metadata
   into a Perl string and an array of JSON objects.
2. Make eval latency a bit lower by allowing the queue runner to start
   ingesting builds faster.
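
Incremental ingestion of a streamed evaluation can be sketched in Python as follows. This is an illustration of the technique rather than the Perl code in this commit. It assumes the evaluator emits one JSON object per line, and the `attr` field and the names `iter_jobs` and `ingest` are used only for this example.

```python
import io
import json
from typing import Callable, Iterator, TextIO


def iter_jobs(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded job at a time instead of buffering the whole eval."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


def ingest(stream: TextIO, handle_job: Callable[[dict], None]) -> int:
    """Hand each job to handle_job as soon as it is read; return the job count."""
    count = 0
    for job in iter_jobs(stream):
        handle_job(job)
        count += 1
    return count


if __name__ == "__main__":
    # Example input standing in for the evaluator's stdout.
    sample = io.StringIO('{"attr": "hello"}\n{"attr": "world"}\n')
    ingest(sample, lambda job: print(job["attr"]))
```

Only one job's metadata is held in memory at a time, and each job can be acted on before the evaluation as a whole has finished.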
2024-07-17 12:05:41 +02:00