fix(builders/netboot): make "normal" evaluation pass #111

Closed
janik wants to merge 1 commit from builder into main
Owner

Without this patch running colmena build will run into a few assertion
errors for machines that have config.bagel.baremetal.builders.netboot == true
set. This is due to an assertion check in the initrd module making sure
there is a mount point for /. This can be trivially fixed by just
setting the mount point to the real world value, which is a tmpfs with
64GB assigned.

We also set deployment.targetHost to a domain that will
never resolve in the public internet, to make sure nobody applies these
machines by hand. It would have been nice to throw a error whenever
colmena apply gets executed for one of these hosts, but doing so would
defeat the purpose of this patch, because the colmena build and apply
argument both evaluate the exact same code paths and thus colmena
build would error again.

The motivation behind this was, so we could run colmena build in CI
in the future, and to not scare of new contributors with random build
failures when they first try to build the machines.

The proper solution would be to exclude all the network booted builders
from the regular colmena hive that is exposed to the cli, but this is
too many yaks to shave for now.

Without this patch running `colmena build` will run into a few assertion errors for machines that have `config.bagel.baremetal.builders.netboot == true` set. This is due to an assertion check in the initrd module making sure there is a mount point for `/`. This can be trivially fixed by just setting the mount point to the real world value, which is a tmpfs with 64GB assigned. We also set `deployment.targetHost` to a domain that will never resolve in the public internet, to make sure nobody applies these machines by hand. It would have been nice to throw a error whenever `colmena apply` gets executed for one of these hosts, but doing so would defeat the purpose of this patch, because the colmena `build` and `apply` argument both evaluate the exact same code paths and thus colmena `build` would error again. The motivation behind this was, so we could run `colmena build` in CI in the future, and to not scare of new contributors with random build failures when they first try to build the machines. The proper solution would be to exclude all the network booted builders from the regular colmena hive that is exposed to the cli, but this is too many yaks to shave for now.
janik added 1 commit 2024-09-21 01:49:02 +00:00
Without this patch running `colmena build` will run into a few assertion
errors for machines that have `config.bagel.baremetal.builders.netboot == true`
set. This is due to an assertion check in the initrd module making sure
there is a mount point for `/`. This can be trivially fixed by just
setting the mount point to the real world value, which is a tmpfs with
64GB assigned.

We also set `deployment.targetHost` to a domain that will
never resolve in the public internet, to make sure nobody applies these
machines by hand. It would have been nice to throw a error whenever
`colmena apply` gets executed for one of these hosts, but doing so would
defeat the purpose of this patch, because the colmena `build` and `apply`
argument both evaluate the exact same code paths and thus colmena
`build` would error again.

The motivation behind this was, so we could run `colmena build` in CI
in the future, and to not scare of new contributors with random build
failures when they first try to build the machines.

The proper solution would be to exclude all the network booted builders
from the regular colmena hive that is exposed to the cli, but this is
too many yaks to shave for now.
janik requested review from yu-re-ka 2024-09-21 01:49:14 +00:00
Contributor

Thanks. This seems to address the error about fileSystems I was getting.

Thanks. This seems to address the error about `fileSystems` I was getting.
janik force-pushed builder from 9bab95bab7 to 464a726664 2024-09-23 21:14:32 +00:00 Compare
Owner

Alternative solutions would be to do colmena build --on @servers and have @builders be a separate tag?
Or we can adopt a special colmena fork and fix this upstream.

I have patches to make this sort of stuff work ~better.

Alternative solutions would be to do `colmena build --on @servers` and have `@builders` be a separate tag? Or we can adopt a special colmena fork and fix this upstream. I have patches to make this sort of stuff work ~better.
Owner

(I'm a bit on the fence regarding letting an invalid configuration exist at all, but I understand why it can be nice to fix this in an easier way.)

(I'm a bit on the fence regarding letting an invalid configuration exist at all, but I understand why it can be nice to fix this in an easier way.)
Author
Owner

Or we can adopt a special colmena fork and fix this upstream.
I have patches to make this sort of stuff work ~better.

Can you like those patches, I would be curious to read them. (I assume it's your custom activation stuff)

Alternative solutions would be to do colmena build --on @servers and have @builders be a separate tag?

That's less clear for people unfamiliar with the code base.

(I'm a bit on the fence regarding letting an invalid configuration exist at all, but I understand why it can be nice to fix this in an easier way.)

Understandable, but the problem already exists without this patch. The pr would merely alleviate some of the associated user pain.

> Or we can adopt a special colmena fork and fix this upstream. > I have patches to make this sort of stuff work ~better. Can you like those patches, I would be curious to read them. (I assume it's your custom activation stuff) > Alternative solutions would be to do colmena build --on @servers and have @builders be a separate tag? That's less clear for people unfamiliar with the code base. > (I'm a bit on the fence regarding letting an invalid configuration exist at all, but I understand why it can be nice to fix this in an easier way.) Understandable, but the problem already exists without this patch. The pr would merely alleviate some of the associated user pain.
janik force-pushed builder from 464a726664 to cef88ec598 2024-09-29 01:44:20 +00:00 Compare
Owner

Understandable, but the problem already exists without this patch. The pr would merely alleviate some of the associated user pain.

Right now, colmena building these builders will cause the NixOS module fail and make you unable to produce invalid artifacts.

Can you like those patches, I would be curious to read them. (I assume it's your custom activation stuff)

c959d643db
and there must be some others in another tree, I don't find right now.

That's less clear for people unfamiliar with the code base.

But fairly well documented in colmena and we can document it again.

> Understandable, but the problem already exists without this patch. The pr would merely alleviate some of the associated user pain. Right now, colmena building these builders will cause the NixOS module fail and make you unable to produce invalid artifacts. > Can you like those patches, I would be curious to read them. (I assume it's your custom activation stuff) https://git.dgnum.eu/DGNum/colmena/commit/c959d643dbfe7e932c359947f653b1eb49b2be35 and there must be some others in another tree, I don't find right now. > That's less clear for people unfamiliar with the code base. But fairly well documented in colmena and we can document it again.
Owner

Superseded by #124.

Superseded by #124.
raito closed this pull request 2024-10-06 08:10:09 +00:00

Pull request closed

Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: the-distro/infra#111
No description provided.