feat: add automatic database dump to my S3 #270

Merged
RaitoBezarius merged 2 commits from dump-db into main 2024-10-08 11:18:28 +00:00
RaitoBezarius commented 2024-10-07 16:18:50 +00:00 (Migrated from github.com)

In another commit, I will add documentation and an S3 reverse proxy to read those dumps.

fricklerhandwerk (Migrated from github.com) reviewed 2024-10-08 05:51:50 +00:00
fricklerhandwerk (Migrated from github.com) left a comment

Why is this not part of the service setup that runs after ingestion and evals complete? We'd need that for production anyway; it's specific to a deployment, and the dump target could be a configuration parameter; it's independent of the forge, so I see no reason to run it from an action.

RaitoBezarius commented 2024-10-08 08:34:42 +00:00 (Migrated from github.com)

> Why is this not a part of the service setup that runs after ingestions and evals complete? We'd need that for production anyway; it's specific to a deployment and the dump target could be configuration parameters; it's independent from the forge so I see no reason to run it from an action.

Well, I see things differently. A backup or snapshot system is a property of the deployment because of the exact specifics of the targets, where you put the directories, and so on. I can move it inside the NixOS module, but that will just make it even harder for anyone other than me to deploy. This is the classic debate about people putting too many opinions into a NixOS module and ending up annoying external users who just want to override an entire phase.
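(For concreteness, the in-module version being declined here would amount to roughly the following sketch. The service name, schedule, database name, and bucket URL are all placeholders for illustration, not the actual deployment.)

```nix
{ pkgs, ... }:
{
  # Hypothetical sketch: names, schedule, and bucket are placeholders.
  # Real credentials would be wired in via an EnvironmentFile or other
  # secret management, never hardcoded.
  systemd.services.wst-db-dump = {
    description = "Dump the tracker database and upload it to S3";
    serviceConfig = {
      Type = "oneshot";
      User = "postgres";
    };
    path = [ pkgs.postgresql pkgs.awscli2 ];
    script = ''
      pg_dump web-security-tracker | aws s3 cp - s3://example-bucket/dump.sql
    '';
  };

  systemd.timers.wst-db-dump = {
    wantedBy = [ "timers.target" ];
    timerConfig.OnCalendar = "daily";
  };
}
```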

To me, it doesn't really make sense to formalize something as trivial as a `pg_dump | aws s3 cp` *inside* the application deployment code.

In addition, we do not really have a signal for "ingestion and evaluation are completed", and we should probably not tie backups to it; that's overengineering. If we want that, we will need PITR (point-in-time recovery), and then the disk space consumption and the backup script are completely different. I'm not sure we would deliver any useful value by pursuing this avenue.

So, if you really want this moved into the NixOS module, I can do it, but it doesn't seem like a high-impact use of time.

(Other arguments: you usually put the backup service on a separate piece of infrastructure rather than on the one being backed up, you get free monitoring via GH Actions failing if the schedule has a problem, etc.)

fricklerhandwerk commented 2024-10-08 08:46:18 +00:00 (Migrated from github.com)

Makes sense generally, but what prevents us from having an executable-typed option for the backup service? Then you inject those deployment-level details into the system from the point of view of the deployment. That script you wrote could be exactly the value you supply in `staging/configuration.nix`, except the scheduling happens in the service itself. And the ingestion and eval runs do complete, don't they? Why can't that be the signal? It doesn't have to be built now, but at least we'd have the option. Stuffing things into GitHub doesn't give you that option.

I'm not sure I buy the free monitoring argument, since we need (and decided to build) proper monitoring anyway.
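(The executable-typed option suggested above could look roughly like this; the option path, name, and type are assumptions for illustration, not an existing knob in the module.)

```nix
{ lib, ... }:
{
  options.services.web-security-tracker.backupCommand = lib.mkOption {
    # Hypothetical option: an executable supplied by the deployment,
    # e.g. the existing staging/dump-database.sh, which the service
    # would run once ingestion and evaluation complete.
    # null disables backups entirely.
    type = lib.types.nullOr lib.types.path;
    default = null;
    description = "Executable run after ingestion and evaluation complete.";
  };
}
```

The deployment would then set `services.web-security-tracker.backupCommand = ./staging/dump-database.sh;`, while the scheduling stays inside the service.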

RaitoBezarius commented 2024-10-08 09:17:48 +00:00 (Migrated from github.com)

> Makes sense generally, but what prevents us from having an executable-typed option for the backup service? Then you inject those deployment-level details into the system from the POV of the deployment. That script you wrote could be exactly the value you supply in `staging/configuration.nix`, except the scheduling happens in the service itself. And I thought the ingestion and eval runs merely complete? Why can't this be the signal? Doesn't have to be built now, but at least we'd have the option. Stuffing things into GitHub doesn't give you that option.

You mean something like `systemd.services.web-security-tracker-delta.postStart = ''$do a backup'';`?
That is already possible and is a choice for the ops people; I assume folks already have an understanding of how NixOS (and systemd) works and how operations work in general.

I can explain how to do these things, but this is really teaching things that are supposed to be known at this point.

This knob doesn't need to be explicitly encapsulated in the NixOS module because we don't know what the set of possibilities is; it's unbounded. So instead of providing a non-overridable knob (or an almost useless one), we can just let people use lower-level NixOS module options to do their orchestration directly, as they see fit.

Stuffing things into GitHub changed nothing about this option which already existed since day 1.

Anyone can decide to write:

```nix
systemd.services.web-security-tracker-delta.postStart = ''
  ${./staging/dump-database.sh}
'';
```

modulo very slight changes, in their own deployment. I didn't remove this possibility.

> I'm not sure I buy the free monitoring argument, since we need (and decided to build) proper monitoring anyway.

Well, for the official infra, yes. For staging, I may share a dashboard, but I won't spend all my time building bespoke dashboards for something that could have been handled by GHA.

fricklerhandwerk commented 2024-10-08 10:20:15 +00:00 (Migrated from github.com)

Alright, thanks once again for the explanation. Dropping a hint about using `postStart` in a comment in a relevant place would help pick this up on the ops side. We can't expect everyone to know every single detail in advance.

fricklerhandwerk (Migrated from github.com) approved these changes 2024-10-08 10:20:22 +00:00
erictapen commented 2024-10-14 10:57:32 +00:00 (Migrated from github.com)

I'm still not sure how to access the dump. Do I need to know the name of the file?
https://dumps.sectracker.nixpkgs.lahfa.xyz/ gives `Not found`.

RaitoBezarius commented 2024-10-14 12:50:54 +00:00 (Migrated from github.com)

This is fixed, @erictapen. Mea culpa, I fucked up the vhost.
