Commit graph

4216 commits

Author SHA1 Message Date
Yureka
e4d466ffcd update lix 2024-10-05 23:32:45 +02:00
d3257e4761 Merge pull request 'Add metric for builds waiting for download slot' (#9) from waiting-metrics into main
Reviewed-on: #9
Reviewed-by: raito <raito@noreply.git.lix.systems>
2024-10-01 16:16:24 +00:00
f23ec71227 Add metric for builds waiting for download slot 2024-10-01 19:14:24 +03:00
ac37e44982 Merge pull request 'Update Lix; fix build' (#7) from ma27/hydra:lix-update into main
Reviewed-on: #7
Reviewed-by: leo60228 <leo@60228.dev>
2024-09-02 22:56:03 +00:00
6a88e647e7
flake.lock: Update; fix build
Flake lock file updates:

• Updated input 'lix':
    'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=278fddc317cf0cf4d3602d0ec0f24d1dd281fadb' (2024-08-17)
  → 'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=02eb07cfd539c34c080cb1baf042e5e780c1fcc2' (2024-09-01)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/c3d4ac725177c030b1e289015989da2ad9d56af0' (2024-08-15)
  → 'github:NixOS/nixpkgs/6e99f2a27d600612004fbd2c3282d614bfee6421' (2024-08-30)
2024-09-02 10:53:46 +02:00
8d5d4942e1
queue-runner: remove unused method from State 2024-08-27 02:57:37 +02:00
e5a8ee5c17
web: require permissions for /api/push 2024-08-27 02:57:16 +02:00
fd7fd0ad65
treewide: clang-tidy modernize 2024-08-27 01:33:12 +02:00
d3fcedbcf5
treewide: enable clang-tidy bugprone findings
Fix some trivial findings throughout the codebase, mostly making
implicit casts explicit.
2024-08-27 00:43:17 +02:00
3891ad77e3
queue-runner: change Machine object creation to work around clang bug
https://github.com/llvm/llvm-project/issues/106123
2024-08-26 22:34:48 +02:00
21fd1f8993
flake: add devShells, including a clang one for clang-tidy & more 2024-08-26 22:22:18 +02:00
ab6d81fad4
api: fix github webhook 2024-08-26 20:26:21 +02:00
Sandro
64df0cba47
Match URIs that don't end in .git
Co-authored-by: Charlotte <lotte@chir.rs>
2024-08-26 20:26:21 +02:00
Sandro Jäckel
6179b298cb
Add gitea push hook 2024-08-26 20:26:20 +02:00
44b9a7b95d
queue-runner: handle broken pg pool connections in builder code
Completes 9b62c52e5c with another location
that was initially missed.
2024-08-25 22:05:13 +02:00
3ee51dbe58 readIntoSocket: fix with store URIs containing an &
The third argument to `open()` in `-|` mode is passed to a shell if it's
a string. In my case the store URI contains
`?secret-key=${signingKey.directory}/secret&compression=zstd`

For the `nix store cat` case this means that

* until `&` the process will be started in the background. This fails
  immediately because no path to cat is specified.
* `compression=zstd` is a variable assignment
* the `$path` argument to `store cat` is attempted to be executed as
  another command

Passing just the list solves the problem.
2024-08-18 21:41:54 +00:00
e987f74954
doc: drop dev-notes & make update-dbix more discoverable
`dev-notes` are severely outdated. I dropped everything except one note
that I moved to hacking.md. The parts about creating users are also
covered elsewhere.

The `update-dbix` part got a just command to make it discoverable again.
2024-08-18 14:47:09 +02:00
1f802c008c
flake.lock: Update
Flake lock file updates:

• Updated input 'lix':
    'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=5137cea99044d54337e439510a647743110b2d7d' (2024-08-10)
  → 'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=278fddc317cf0cf4d3602d0ec0f24d1dd281fadb' (2024-08-17)
• Updated input 'nix-eval-jobs':
    'git+https://git.lix.systems/lix-project/nix-eval-jobs?ref=refs/heads/main&rev=c057494450f2d1420726ddb0bab145a5ff4ddfdd' (2024-07-17)
  → 'git+https://git.lix.systems/lix-project/nix-eval-jobs?ref=refs/heads/main&rev=42a160bce2fd9ffebc3809746bc80cc7208f9b08' (2024-08-13)
• Updated input 'nix-eval-jobs/flake-parts':
    'github:hercules-ci/flake-parts/9227223f6d922fee3c7b190b2cc238a99527bbb7' (2024-07-03)
  → 'github:hercules-ci/flake-parts/8471fe90ad337a8074e957b69ca4d0089218391d' (2024-08-01)
• Updated input 'nix-eval-jobs/treefmt-nix':
    'github:numtide/treefmt-nix/0fb28f237f83295b4dd05e342f333b447c097398' (2024-07-15)
  → 'github:numtide/treefmt-nix/349de7bc435bdff37785c2466f054ed1766173be' (2024-08-12)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/a781ff33ae258bbcfd4ed6e673860c3e923bf2cc' (2024-08-10)
  → 'github:NixOS/nixpkgs/c3d4ac725177c030b1e289015989da2ad9d56af0' (2024-08-15)
2024-08-18 14:18:36 +02:00
3a4e0d4917
Get dev environment working again
* justfile, inspired from Lix.
* let foreman use the stuff from outputs, similar to what Lix does>
* mess around with PERL5LIB[1] and PATH to get tests running locally.

[1] I don't really know how `Setup` was found before tbh.
2024-08-18 14:18:36 +02:00
3517acc5ba
Add direnv & PLS to the dev setup 2024-08-18 14:18:35 +02:00
459aa0a598 Stream files from store instead of buffering them
When an artifact is requested from hydra the output is first copied
from the nix store into memory and then sent as a response, delaying
the download and taking up significant amounts of memory.

As reported in https://github.com/NixOS/hydra/issues/1357

Instead of calling a command and blocking while reading in the entire
output, this adds read_into_socket(). the function takes a
command, starting a subprocess with that command, returning a file
descriptor attached to stdout.
This file descriptor is then by responsebuilder of Catalyst to steam
the output directly
2024-08-13 22:09:48 +02:00
f1b552ecbf update flake locks, fix compile errors 2024-08-12 22:45:34 +02:00
db8c2cc4a8
.github: remove
The primary host for this repo isn't github, and this is causing a whole
bunch of false indications in the Lix Forgejo UI as it tries to run the
.github/workflow with non-existing runners.
2024-08-11 16:15:12 +02:00
Rick van Schijndel
8858abb1a6
t/test.pl: increase event-timeout, set qvf
Some checks failed
Test / tests (push) Has been cancelled
Only log issues/failures when something's actually up.
It has irked me for a long time that so much output came
out of running the tests, this seems to silence it.
It does hide some warnings, but I think it makes the output
so much more readable that it's worth the tradeoff.

Helps for highly parallel running of jobs, sometimes they'd not give output for a while.
Setting this timeout higher appears to help.
Not completely sure if this is the right place to do it, but it works fine for me.
2024-08-11 16:08:35 +02:00
Rick van Schijndel
ef619eca99
t: increase timeouts for slow commands with high load
We've seen many fails on ofborg, at lot of them ultimately appear to come down to
a timeout being hit, resulting in something like this:

Failure executing slapadd -F /<path>/slap.d -b dc=example -l /<path>/load.ldif.

Hopefully this resolves it for most cases.
I've done some endurance testing and this helps a lot.
some other commands also regularly time-out with high load:

- hydra-init
- hydra-create-user
- nix-store --delete

This should address most issues with tests randomly failing.

Used the following script for endurance testing:

```

import os
import subprocess

run_counter = 0
fail_counter = 0

while True:
    try:
        run_counter += 1
        print(f"Starting run {run_counter}")
        env = os.environ
        env["YATH_JOB_COUNT"] = "20"
        result = subprocess.run(["perl", "t/test.pl"], env=env)
        if (result.returncode != 0):
            fail_counter += 1
        print(f"Finish run {run_counter}, total fail count: {fail_counter}")
    except KeyboardInterrupt:
        print(f"Finished {run_counter} runs with {fail_counter} fails")
        break
```

In case someone else wants to do it on their system :).
Note that YATH_JOB_COUNT may need to be changed loosely based on your
cores.
I only have 4 cores (8 threads), so for others higher numbers might
yield better results in hashing out unstable tests.
2024-08-11 16:08:09 +02:00
marius david
41dfa0e443
Document the default user and port in hacking.md
Some checks are pending
Test / tests (push) Waiting to run
2024-08-11 16:06:08 +02:00
4b107e6ff3
hydra-eval-jobset: pass --workers and --max-memory-size to n-e-j
Some checks failed
Test / tests (push) Has been cancelled
Lost in the h-e-j -> n-e-j migration, causing evaluation to always be
single threaded and limited to 4GiB RAM. Follow the config settings like
h-e-j used to do (via C++ code).
2024-07-22 23:16:29 +02:00
4b886d9c45
autotools -> meson
Some checks are pending
Test / tests (push) Waiting to run
There are some known regressions regarding local testing setups - since
everything was kinda half written with the expectation that build dir =
source dir (which should not be true anymore). But everything builds and
the test suite runs fine, after several hours spent debugging random
crashes in libpqxx with MALLOC_PERTURB_...
2024-07-22 22:30:41 +02:00
fbb894af4e
static: de-bundle vendored dependencies
The current way this whole build works is incompatible with having a
separate build dir, or at least with having a separate build dir. To be
improved in the future - maybe minimize the dependencies a bit. But this
isn't so much data that we really have to care.
2024-07-22 16:30:13 +02:00
Niklas Hambüchen
8a984efaef
renderInputDiff: Increase git hash length 8 -> 12
Some checks failed
Test / tests (push) Has been cancelled
See investigation on lengths required to be conflict-free in practice:

https://github.com/NixOS/hydra/pull/1258#issuecomment-1321891677
2024-07-21 12:23:29 +02:00
abc9f11417
queue runner: fix store URI args being written to the SSH hosts file
Some checks are pending
Test / tests (push) Waiting to run
2024-07-20 16:09:07 +02:00
9a4a5dd624
jobset-eval: fix actions not showing up sometimes for new jobs
Some checks are pending
Test / tests (push) Waiting to run
New jobs have their "new" status take precedence over them being
"failed" or "queued", which means actions that can act on "failed" or
"queued" jobs weren't shown to the user when they could only act on
"new" jobs.
2024-07-20 13:09:39 +02:00
ac406a9175
nixos-modules: hydra-queue-runner fix network-online.target eval warning
Some checks failed
Test / tests (pull_request) Has been cancelled
Test / tests (push) Has been cancelled
2024-07-19 09:13:32 +02:00
73616aa0d9
nixos-module: don't force Nix GC to keep outputs
Some checks failed
Test / tests (push) Has been cancelled
This isn't actually needed (h.n.o even overrides it!).

Fix the use of deprecated `gc-keep-derivations` alias along the way.
2024-07-17 13:21:58 +02:00
d33fc08341
nixos-module: fix trusted users
- Use extra-trusted-users to avoid overriding the default set of trusted
  users and causing permission issues.
- Add hydra and hydra-www users which also need permissions.
2024-07-17 13:20:37 +02:00
b0e9b4b2f9
hydra-eval-jobset: incrementally ingest eval results
Some checks failed
Test / tests (push) Has been cancelled
Test / tests (pull_request) Has been cancelled
nix-eval-jobs streams output, unlike hydra-eval-jobs. Now that we've
migrated, we can use this to:

1. Use less RAM by avoiding buffering a whole eval's worth of metadata
   into a Perl string and an array of JSON objects.
2. Make evals latency a bit lower by allowing the queue runner to start
   ingesting builds faster.
2024-07-17 12:05:41 +02:00
370a4bf138
treewide: start removing tests related to constituents
Some checks failed
Test / tests (push) Waiting to run
Test / tests (pull_request) Has been cancelled
The feature cannot easily be ported to nix-eval-jobs since it requires
deep integration into the evaluator, and h.n.o doesn't use it. Later
more of this will be ripped out.
2024-07-17 08:31:19 +02:00
ed7c58708c
hydra-eval-jobs: remove, replaced by nix-eval-jobs 2024-07-17 08:31:19 +02:00
6d4ccff43c
hydra-eval-jobset: use nix-eval-jobs instead of hydra-eval-jobs 2024-07-17 08:31:19 +02:00
684cc50d86
flake: add nix-eval-jobs as input 2024-07-17 08:17:32 +02:00
6195cec6a3
hydra-queue-runner: adjust for Lix generators related changes 2024-07-16 04:35:44 +02:00
1fbfed8162
flake: rename 'nix' input to 'lix'
For consistency with other Lix forks of Nix ecosystems projects, e.g.
nix-eval-jobs.
2024-07-16 03:59:38 +02:00
fb9e29d4d0
queue runner: fix nullptr deref on build exception after releasing a machine reservation
Some checks failed
Test / tests (push) Has been cancelled
2024-07-13 06:12:35 +02:00
05d620a54f
flake.lock: Update
Some checks are pending
Test / tests (push) Waiting to run
Flake lock file updates:

• Updated input 'nix':
    'git+https://git@git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=4c3d93611f2848c56ebc69c85f2b1e18001ed3c7' (2024-06-24)
  → 'git+https://git@git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=4b109ec1a8fc4550150f56f0f46f2f41d844bda8' (2024-07-11)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/e4509b3a560c87a8d4cb6f9992b8915abf9e36d8' (2024-06-23)
  → 'github:NixOS/nixpkgs/a046c1202e11b62cbede5385ba64908feb7bfac4' (2024-07-11)
2024-07-13 03:08:50 +02:00
a9a2679793
hydra-evaluator: fix regression from e9d0a3 (inverted assertion)
Some checks failed
Test / tests (push) Has been cancelled
2024-06-24 21:41:40 +02:00
e9d0a3a754
Update to latest Lix main
Some checks are pending
Test / tests (push) Waiting to run
2024-06-24 20:25:35 +02:00
cbe527a3ee
util.hh split
Some checks failed
Test / tests (push) Has been cancelled
2024-06-11 11:27:43 -04:00
ca98f42b39
nixexpr -> lixexpr 2024-06-11 11:13:42 -04:00
John Ericson
62bc5b54b2
Try again to ensure hydra module is usable
Some checks are pending
Test / tests (push) Waiting to run
Nixpkgs only contains a `hydra_unstable`, not `hydra`, package, so
adjust the default accordingly, and then override it to our package in
the separate module which does that.

(cherry picked from commit e149da7b9bbc04bd0b1ca03fa0768e958cbcd40e)
2024-06-10 17:40:02 +02:00
John Ericson
c98017b823
Factor out NixOS tests, and clean up
Due to newer nixpkgs, there were a number of things that could be
cleaned up in the process.

(cherry picked from commit 743795b2b090a5cdfe8bd90120add8db7770086a)
2024-06-10 17:40:02 +02:00