In NixOS, the user generation script was changed to set the permissions `0700`
to a home-directory that's specified in the `users.users`-submodule with
`createHome` being set to `true`[1].
However, the home-directory of `hydra` is also the base directory of other services using
other users (e.g. `hydra-queue-runner`). With permissions being `0700`, processes with
such a user cannot traverse into `/var/lib/hydra` and thus not into subdirectories.
I guess that this issue was kind of hidden because `hydra-init.service` ensures
proper permissions[2]. However, if `hydra-init.service` is not restarted on a
system-activation, the permissions of `/var/lib/hydra` will be set back to `0700`
by the activation script that runs on each activation.
This has lead to errors like this in `hydra-queue-runner` on my Hydra:
```
Sep 20 09:11:30 hydra hydra-queue-runner[306]: error (ignored): error: cannot unlink '/var/lib/hydra/build-logs/7h/dssz03gazrkqzfmlr5cprd0dvkg4db-squashfs.img.drv': Permission denied
Sep 20 09:11:30 hydra hydra-queue-runner[306]: error (ignored): error: cannot unlink '/var/lib/hydra/build-logs/b9/350vd8jpv1f86i312c9pkdcd2z56aw-squashfs.img.drv': Permission denied
Sep 20 09:11:30 hydra hydra-queue-runner[306]: error (ignored): error: cannot unlink '/var/lib/hydra/build-logs/kz/vlq4v9a1rylcp4fsqqav3lcjgskky4-squashfs.img.drv': Permission denied
Sep 20 09:11:30 hydra hydra-queue-runner[306]: error (ignored): error: cannot unlink '/var/lib/hydra/build-logs/xd/hkjnbbr9jp7364pkn8zpk9v8xapj2c-nix-2.4pre20210917_37cc50f.drv': Permission denied
Sep 20 09:11:30 hydra hydra-queue-runner[306]: error (ignored): error: cannot unlink '/var/lib/hydra/build-logs/zn/9df7225fl8p7iavqqfvlyay4rf0msw-nix-2.4pre20210917_37cc50f.drv': Permission denied
Sep 20 09:11:30 hydra hydra-queue-runner[306]: possibly transient failure building ‘/nix/store/7hdssz03gazrkqzfmlr5cprd0dvkg4db-squashfs.img.drv’ on ‘roflmayr’: error: creating directory '/var/lib/hydra/build-logs': Permission denied
Sep 20 09:11:30 hydra hydra-queue-runner[306]: will retry ‘/nix/store/7hdssz03gazrkqzfmlr5cprd0dvkg4db-squashfs.img.drv’ after 543s
```
Because of that, I decided to remove the `createHome = true;` setting and instead used
`systemd-tmpfiles`[3] which can not only ensure that certain directories
exist, but also proper permissions.
With this change, we can also get rid of the manual setup in
`hydra-init.service` since `systemd-tmpfiles` will be executed by
`switch-to-configuration` before *any* systemd service gets started. On
startup, `systemd-tmpfiles-setup.service` is invoked within
`sysinit.target` being reached, so when `hydra-init.service` gets called
in `multi-user.target`, the structure already exists.
[1] fa0d499dbf
[2] 3cec908738/hydra-module.nix (L260-L262)
[3] https://www.freedesktop.org/software/systemd/man/systemd-tmpfiles.html
Yet again, manual testing is proving to be insufficient. I'm pretty
sure I wrote this code but lost it in a rebase, or perhaps the switch
to result classes.
At any rate, this implements the actual "fetch a retry row and run it"
for the hydra-notify daemon.
Tested by hand.
Get the number of seconds before the next retriable task is ready.
This number is specifically intended to be used as a timeout, where
`undef` means never time out.
The test seems to be broken for a while[1]. The cause for this is that
in gitea 1.14 the `create-user` command got renamed to `user create`.
[1] https://hydra.nixos.org/build/151092299
This gives us a place to put helper functions that act on entire
tables, not just individual records.
This should be a backwards compatible change, except in places we're
manually using result class names.