feat(monitoring): add uptime-kuma for status page, see #97 #118

Merged
raito merged 1 commit from kiaragrouwstra/infra:feature-service-uptime-kuma into main 2024-10-01 16:13:27 +00:00
Contributor

Adds a config for a status page using uptime-kuma.
Open questions here included:

  • what machine to run this on
    (and, if a new one, how to configure its network bits);
  • who could help set the secret in the age file;
  • who could set up the application password (currently a manual step in
    services.uptime-kuma), after which the stateless client can be re-built;
  • what to monitor -- for now I commented out some subdomains I could not
    access publicly to test.
kiaragrouwstra added 1 commit 2024-09-27 06:51:52 +00:00
janik requested changes 2024-09-28 17:47:47 +00:00
janik left a comment
Owner

> what machine to run this on

I think we should add this to public01.

> who could help set the secret in the age file;

That would be @raito afaik.
(btw we should document the secret management)

> who could set up the application password (currently a manual step in services.uptime-kuma), after which the stateless client can be re-built;

Isn't the passwordFile set using agenix in line 82, or does someone have to ssh onto the host and run some uptime-kuma management command? (I'm unfamiliar with uptime-kuma)

> what to monitor -- i for now commented some sub-domains i could not publicly access to test.

Anything with a public-facing web interface, which should basically be everything.
You might not be able to access domains such as fodwatch or news at the time because we (try to) shut down the bare metal servers if the services aren't actively in use.
@ -0,0 +5,4 @@
...
}:
let
subdomains = [
Owner

It would be preferable to pull the domain information from `terraform/dnsimple.nix`, so we avoid maintaining the list of domains twice.

Owner

I assume this should be injected via `specialArgs` to make it available.
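
For illustration, injecting shared data via `specialArgs` could look roughly like the sketch below. It assumes the hosts are built with `nixpkgs.lib.nixosSystem` and that the domain list lives in a plain data file; the file paths and the `subdomains` argument name are hypothetical and not taken from this repository.

```nix
# flake.nix (sketch only; input declarations omitted)
{
  outputs = { self, nixpkgs, ... }@inputs:
    let
      # Hypothetical data-only file that evaluates to a list of strings.
      subdomains = import ./common/subdomains.nix;
    in
    {
      nixosConfigurations.public01 = nixpkgs.lib.nixosSystem {
        system = "x86_64-linux";
        # Every attribute in specialArgs becomes an argument of each module.
        specialArgs = { inherit inputs subdomains; };
        # Hypothetical module path; that module could then start with:
        #   { subdomains, ... }: { ... }
        modules = [ ./services/uptime-kuma ];
      };
    };
}
```

Passing the value as a module argument keeps the list in one place without importing the Terraform module itself.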

Author
Contributor

hm. i agree, tho i'm not entirely sure yet what the best approach would be here.
if the file were `import`ed (edit: / injected), i'm under the impression it would go thru a `mkIf` check.
maybe the data-y bits could be e.g. factored out into a separate file to reuse from both `uptime-kuma` and `dnsimple` (if not also from `gandi`, which has lotsa overlap with `dnsimple`)?
feedback welcome. 😶

Owner

Let's keep it simple for now; we can do it in a follow-up PR. It's unclear how to extract the data-y bits. Maybe Terranix could expose a data-only module that we could re-import in the whole expression, but that's unclear to me yet.

@ -0,0 +26,4 @@
# "news"
];
host = "status.forkos.org";
Owner

Please add the domain to terraform ^^

@ -0,0 +29,4 @@
host = "status.forkos.org";
port = 3001;
in
{
Owner

Please guard the config behind a `mkIf` with an enable option like `options.bagel.status.enable = mkEnableOption "status page";`. Currently, applying this change would enable this service on every host.
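
A minimal sketch of the guard requested here, using the option name from the comment above. The body of `config` is abbreviated to a single setting; the rest of the service configuration would go inside the same `mkIf`.

```nix
{ config, lib, ... }:
let
  cfg = config.bagel.status;
  inherit (lib) mkEnableOption mkIf;
in
{
  # Declares a boolean option that defaults to false.
  options.bagel.status.enable = mkEnableOption "status page";

  # Nothing below is applied unless a host sets bagel.status.enable = true.
  config = mkIf cfg.enable {
    services.uptime-kuma.enable = true;
  };
}
```

A host such as public01 would then opt in with `bagel.status.enable = true;`, while all other hosts importing the module stay unaffected.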

raito marked this conversation as resolved
@ -0,0 +31,4 @@
in
{
imports = [ "${inputs.stateless-uptime-kuma}/nixos/module.nix" ];
nixpkgs.overlays = [ (import "${inputs.stateless-uptime-kuma}/overlay.nix") ];
Owner

We maintain all input-related overlays directly in `flake.nix`.

Owner

Yeah, all modules and overlays should be applied uniformly to all machines.
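
As a rough sketch of what this could mean in practice, the module import and overlay from the diff above could move into `flake.nix` and be shared by every host. The `commonModules` name, the host path, and the use of `nixpkgs.lib.nixosSystem` are assumptions for illustration; the actual flake layout may differ.

```nix
# flake.nix (sketch only; input declarations omitted)
{
  outputs = { self, nixpkgs, ... }@inputs:
    let
      # Overlays derived from flake inputs, collected in one place.
      overlays = [
        (import "${inputs.stateless-uptime-kuma}/overlay.nix")
      ];
      # Modules applied uniformly to every machine (hypothetical name).
      commonModules = [
        "${inputs.stateless-uptime-kuma}/nixos/module.nix"
        { nixpkgs.overlays = overlays; }
      ];
    in
    {
      nixosConfigurations.public01 = nixpkgs.lib.nixosSystem {
        system = "x86_64-linux";
        specialArgs = { inherit inputs; };
        # Hypothetical host-specific module path.
        modules = commonModules ++ [ ./hosts/public01 ];
      };
    };
}
```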

raito marked this conversation as resolved
@ -0,0 +35,4 @@
services.uptime-kuma.enable = true;
services.nginx = {
Owner

The nginx stuff is fine for now, but we should generalize proxying a bit in the future. (Having every service add its own firewall rules and nginx config is redundant and error-prone.)
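
For context, the per-service pattern this comment refers to typically looks something like the sketch below. The `host` and `port` values come from the diff; the TLS settings and firewall ports are assumptions (ACME is assumed to be configured elsewhere via `security.acme`), not a copy of the PR's code.

```nix
{ ... }:
let
  host = "status.forkos.org";
  port = 3001;
in
{
  services.uptime-kuma.enable = true;
  # uptime-kuma reads its listening port from the PORT environment variable.
  services.uptime-kuma.settings.PORT = toString port;

  services.nginx = {
    enable = true;
    virtualHosts.${host} = {
      # Assumes security.acme.acceptTerms and a contact email are set elsewhere.
      enableACME = true;
      forceSSL = true;
      locations."/" = {
        proxyPass = "http://127.0.0.1:${toString port}";
        # The uptime-kuma dashboard relies on WebSockets.
        proxyWebsockets = true;
      };
    };
  };

  # Each service opening these ports itself is the redundancy noted above.
  networking.firewall.allowedTCPPorts = [ 80 443 ];
}
```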

raito marked this conversation as resolved
Owner

@kiaragrouwstra I'm available to help you get this PR to the finish line, if you need my help with anything.

kiaragrouwstra added 1 commit 2024-09-28 20:15:09 +00:00
kiaragrouwstra added 1 commit 2024-09-28 20:18:22 +00:00
kiaragrouwstra added 1 commit 2024-09-28 20:20:57 +00:00
kiaragrouwstra added 1 commit 2024-09-28 20:22:43 +00:00
kiaragrouwstra added 1 commit 2024-09-28 20:25:16 +00:00
Author
Contributor

i pushed updates to address some of the feedback now.

> > who could set up the application password (currently a manual step in services.uptime-kuma), after which the stateless client can be re-built;
>
> Isn't the passwordFile set using agenix in line 82 or does someone have to ssh onto the host and run some uptime-kuma management command? (I'm unfamiliar with uptime-kuma)

i thought so too.
unfortunately, as it turns out, the credential set there is for the client that accesses the actual service in order to populate it declaratively from nix.
as such, it seems the service itself still needs to be given those same credentials manually (imperatively) for now.
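
To illustrate the distinction, here is a minimal sketch of how the client-side credential might be wired up. Only `passwordFile` and agenix are named in this thread; the `statelessUptimeKuma` option namespace, the secret name, and the secret path are assumptions, and the agenix NixOS module is assumed to be imported already.

```nix
{ config, ... }:
{
  # agenix decrypts this file on the host at activation time.
  # Secret name and path are hypothetical.
  age.secrets.stateless-uptime-kuma-password.file =
    ../../secrets/stateless-uptime-kuma-password.age;

  # Assumed option namespace of the stateless-uptime-kuma module.
  # The password lets the client log in to an existing account;
  # it does not create that account in uptime-kuma itself.
  statelessUptimeKuma.passwordFile =
    config.age.secrets.stateless-uptime-kuma-password.path;
}
```

Under these assumptions, the account with matching credentials still has to exist in the running uptime-kuma instance before the client can apply the declared monitors.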

Owner

Nitpick: please adopt https://www.conventionalcommits.org/en/v1.0.0/ for your commit messages (if it's too pesky, please consider doing it for the next PR).

Owner

I pushed 7de60d03a65df3a73cedf8c9fe8009413dd7ee53 on main to give you the secret, feel free to rebase to get it.

Owner

Deployed on https://status.forkos.org/dashboard, seems to work.

Author
Contributor

@raito Thanks, let me check out the commit convention.

Earlier I wrote the commits (esp. follow-up ones) under the presumption I'd rebase them out, tho now I see Forgejo won't take a push force - perhaps relevant for the rebase to main as well.

If I might ask, is there still a way for me to fix the commit messages on this PR?

(Further, now the deployment worked, had you managed to create the account with the desired credentials there as well? I might not know the credential to do that part myself, but at the deployed page I'm currently just seeing the option to log in rather than create account, so if you hadn't I might need to look into that still.)

Owner

> (Further, now the deployment worked, had you managed to create the account with the desired credentials there as well? I might not know the credential to do that part myself, but at the deployed page I'm currently just seeing the option to log in rather than create account, so if you hadn't I might need to look into that still.)

Yep, I did it. The first account becomes an admin, so I just provisioned it imperatively. I pinged the author of the stateless module, and this should probably be fixed upstream with some INITIAL_ADMIN_USER, INITIAL_ADMIN_PASSWORD convergence option.

> Earlier I wrote the commits (esp. follow-up ones) under the presumption I'd rebase them out, tho now I see Forgejo won't take a push force - perhaps relevant for the rebase to main as well.

I'm surprised, this should work out of the box (?).

For the time being, if it's too much of a hassle, we can rebase everything and I can rename the commit. Let me know what you prefer.
kiaragrouwstra force-pushed feature-service-uptime-kuma from 660122477f to 65a4e417eb 2024-09-30 17:28:34 +00:00
kiaragrouwstra force-pushed feature-service-uptime-kuma from 65a4e417eb to df7ad30882 2024-09-30 17:31:31 +00:00
Author
Contributor

@raito apologies, I managed to squash it now. let me know if I should still amend something.

raito merged commit b291caac46 into main 2024-10-01 16:13:27 +00:00
Owner

Thank you @kiaragrouwstra !

Reference: the-distro/infra#118