Commit graph

391 commits

Author SHA1 Message Date
raito c969625b0f fix(sniproxy): outside/inside of infra, the ingress IPs are different
In my infrastructure, the source node is 99::1, outside of my infra,
it's ::1.

All of this machinery was never really meant to be used on this scale,
so oopsie.

We should build our own sniproxy at some point.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-30 19:01:44 +02:00
raito 1b22c1f0ae fix(hydra): proxy it over my sniproxy
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-30 18:34:35 +02:00
Ilya K 30d759edf4 terraform/hydra: switch k900-experiments jobset to less-nixpkgses branch 2024-08-30 19:22:09 +03:00
Pierre Bourdon cd92c9588f
flake.lock: Update
Flake lock file updates:

• Updated input 'hydra':
    'git+https://git.lix.systems/lix-project/hydra.git?ref=refs/heads/main&rev=f1b552ecbf2d011cd4fdb93d7d117388ab9c0027' (2024-08-12)
  → 'git+https://git.lix.systems/lix-project/hydra.git?ref=refs/heads/main&rev=44b9a7b95d23e7a8587cb963f00382046707f2db' (2024-08-25)
• Updated input 'hydra/lix':
    'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=5137cea99044d54337e439510a647743110b2d7d' (2024-08-10)
  → 'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=278fddc317cf0cf4d3602d0ec0f24d1dd281fadb' (2024-08-17)
• Updated input 'hydra/nix-eval-jobs':
    'git+https://git.lix.systems/lix-project/nix-eval-jobs?ref=refs/heads/main&rev=c057494450f2d1420726ddb0bab145a5ff4ddfdd' (2024-07-17)
  → 'git+https://git.lix.systems/lix-project/nix-eval-jobs?ref=refs/heads/main&rev=42a160bce2fd9ffebc3809746bc80cc7208f9b08' (2024-08-13)
• Updated input 'hydra/nix-eval-jobs/flake-parts':
    'github:hercules-ci/flake-parts/9227223f6d922fee3c7b190b2cc238a99527bbb7' (2024-07-03)
  → 'github:hercules-ci/flake-parts/8471fe90ad337a8074e957b69ca4d0089218391d' (2024-08-01)
• Updated input 'hydra/nix-eval-jobs/treefmt-nix':
    'github:numtide/treefmt-nix/0fb28f237f83295b4dd05e342f333b447c097398' (2024-07-15)
  → 'github:numtide/treefmt-nix/349de7bc435bdff37785c2466f054ed1766173be' (2024-08-12)
2024-08-25 22:07:24 +02:00
raito 024b431cbc feat(grafana): plug jsonnet-based dashboards in provisioning
Add the gerrit dashboards as an example.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-24 16:32:21 +02:00
raito d1ffce9336 feat(grafana): jsonnet-based dashboards
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-24 16:17:52 +02:00
Ilya K aef541829e Fix pyroscope datasource 2024-08-24 11:39:25 +03:00
raito 1fc15526d7 fix(pyroscope): add the gRPC endpoint as proxy as well
This is not documented but necessary for Alloy to operate.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-24 10:33:49 +02:00
raito 2544adba8e fix(gerrit): setup Alloy & Pyroscope more according to the docs
Still not working due to "unimplemented: error 404 not found" at push
time, but it's really unclear now why this occur.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-24 08:45:20 +02:00
raito 4f4a25a5ad feat(gerrit): push pyroscope profiling to Pyroscope
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 22:37:33 +02:00
raito 702867cd62 feat(pyroscope): add push API & reverse proxy
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 21:04:22 +02:00
raito 7cde6e92ae feat(grafana): add Pyroscope datasource
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 21:04:11 +02:00
raito 42cfa695ea dns: add pyroscope.forkos.org → meta01
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 21:03:07 +02:00
raito ac7815321a feat(pyroscope): add secrets and storage
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 20:58:08 +02:00
raito db46b01ae9 feat(monitoring): add pyroscope to the infrastructure
Vendored for the time being.
See https://cl.forkos.org/c/nixpkgs/+/181 for upstreaming properly.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 20:43:00 +02:00
raito c380f29937 fix(grafana): remove the global pgsql module dependency for now
We should re-introduce it once things are a bit scoped out.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 20:43:00 +02:00
raito 5dc6165c2e feat(gerrit): add git in the environment to perform git-native clones
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 20:43:00 +02:00
raito 0eaaf860d1 feat(common): enable system wide diff in the activation output
This helps me to review what changes could be problematic in advance.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-23 20:43:00 +02:00
raito bf1b8d4d19 secrets: rekey for public01 access to metrics
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-21 16:45:12 +02:00
raito 58c0dd3d2e feat(public): add listmonk instance on news.forkos.org
To prepare for public communications and updates.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-21 16:45:12 +02:00
raito 8c35dfa8e0 fix(gerrit): tinker a bit with gerrit defaults for transfer & caching
We had some issues in the past with too many packfiles and timeout
during transfers, let's try to provide a bit of relief in bad scenarios.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-21 16:31:16 +02:00
Yureka cfc24abfe1 adjust hydra-gc numbers
for the new ssds
2024-08-20 12:08:49 +02:00
Yureka a72a991863 add A record for cache.forkos.org 2024-08-19 23:06:46 +02:00
Pierre Bourdon f938fcb24e
hydra: increase git operations timeout 2024-08-16 17:44:45 +02:00
Pierre Bourdon 6881351f23
build-coord: copy the baremetal-builders DNS64 config 2024-08-16 09:33:48 +02:00
Pierre Bourdon d3e053809c
hydra: log_prefix needs to be / terminated 2024-08-16 09:25:46 +02:00
Pierre Bourdon e2a990c982
hydra: listen on 127.0.0.1 instead of localhost
For some cursed reasons, the latter doesn't work on build-coord:

Aug 16 07:06:22 build-coord hydra-server[109560]: Resolved [localhost]:3000 to [::1]:3000, IPv6
Aug 16 07:06:22 build-coord hydra-server[109560]: Resolved [localhost]:3000 to [127.0.0.1]:3000, IPv4
Aug 16 07:06:22 build-coord hydra-server[109560]: Binding to TCP port 3000 on host ::1 with IPv6
Aug 16 07:06:22 build-coord hydra-server[109560]: Binding to TCP port 3000 on host 127.0.0.1 with IPv4
Aug 16 07:06:22 build-coord hydra-server[109560]: 2024/08/16-07:06:22 Can't connect to TCP port 3000 on 127.0.0.1 [Invalid argument]
2024-08-16 09:20:49 +02:00
Pierre Bourdon 5fdce0e2b5
hydra: move from bagel-box to build-coord 2024-08-16 09:03:29 +02:00
Pierre Bourdon ce3a40671c
acme: make ToS and contact config common 2024-08-16 09:03:08 +02:00
Pierre Bourdon 8ffb7e51f1
tf/gandi: reduce all TTLs from 1h to 5m
Serving DNS is absurdly cheap (and we don't even do it ourselves right
now), and this makes it easier to iterate on DNS configs.
2024-08-16 08:51:31 +02:00
Pierre Bourdon b7d913b22f
tf/gandi: move hydra CNAME to build-coord 2024-08-16 08:50:35 +02:00
Pierre Bourdon c33326f836
hydra: switch to using mTLS instead of local peer auth 2024-08-16 08:19:18 +02:00
Pierre Bourdon 0dd333c573
postgres: add mTLS support
New client certs can be minted via the provided script, which is meant
to be run on the postgres server (where the CA private key is
conveniently deployed).
2024-08-16 07:59:12 +02:00
Pierre Bourdon e7f25d6ee2
tf/gandi: add a postgres CNAME to bagel-box 2024-08-16 07:34:55 +02:00
Pierre Bourdon 29babfc5c4
Revert "Partial revert "Add Grapevine Matrix server and matrix-hookshot""
This reverts commit 17c342b33e.

Grapevine's use of IFD was fixed upstream.
2024-08-15 16:22:22 +02:00
Pierre Bourdon 50fadb45e2
common: define TZ in base server configs, remove heretical host-specific configuration 2024-08-13 22:38:40 +02:00
Pierre Bourdon 37bcb261ab
ssh-keys: add build-coord, rekey secrets 2024-08-13 22:36:30 +02:00
Pierre Bourdon 5dd9ad553c
build-coord: add initial config 2024-08-13 22:36:30 +02:00
raito 3f2909dd8a public-keys: add public01 SSH host key
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-13 19:15:05 +02:00
Pierre Bourdon 90325344a3
Reserve builder-11 for build coordination, rename to build-coord 2024-08-13 19:12:36 +02:00
Pierre Bourdon 5ace7a63d8
forgejo: base on forgejo-lts since forgejo got bumped to a new master in nixpkgs 2024-08-13 01:50:19 +02:00
Pierre Bourdon 434def3337
flake.lock: Update
Flake lock file updates:

• Updated input 'agenix':
    'github:ryantm/agenix/de96bd907d5fbc3b14fc33ad37d1b9a3cb15edc6' (2024-07-09)
  → 'github:ryantm/agenix/f6291c5935fdc4e0bef208cfc0dcab7e3f7a1c41' (2024-08-10)
• Updated input 'hydra':
    'git+https://git.lix.systems/lix-project/hydra.git?ref=refs/heads/main&rev=4b107e6ff36bd89958fba36e0fe0340903e7cd13' (2024-07-22)
  → 'git+https://git.lix.systems/lix-project/hydra.git?ref=refs/heads/main&rev=f1b552ecbf2d011cd4fdb93d7d117388ab9c0027' (2024-08-12)
• Updated input 'hydra/lix':
    'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=6b4d46e9e0e1dd80e0977684ab20d14bcd1a6bc3' (2024-07-16)
  → 'git+https://git.lix.systems/lix-project/lix?ref=refs/heads/main&rev=5137cea99044d54337e439510a647743110b2d7d' (2024-08-10)
• Updated input 'hydra/lix/nix2container':
    'github:nlewo/nix2container/20aad300c925639d5d6cbe30013c8357ce9f2a2e' (2024-04-13)
  → 'github:nlewo/nix2container/3853e5caf9ad24103b13aa6e0e8bcebb47649fe4' (2024-07-10)
• Updated input 'hydra/lix/pre-commit-hooks':
    'github:cachix/git-hooks.nix/e35aed5fda3cc79f88ed7f1795021e559582093a' (2024-04-02)
  → 'github:cachix/git-hooks.nix/f451c19376071a90d8c58ab1a953c6e9840527fd' (2024-07-15)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/9355fa86e6f27422963132c2c9aeedb0fb963d93' (2024-07-16)
  → 'github:NixOS/nixpkgs/154bcb95ad51bc257c2ce4043a725de6ca700ef6' (2024-08-09)
2024-08-13 01:11:38 +02:00
Pierre Bourdon 8b1ade5580
Revert "update hydra"
This reverts commit f7907a2915.

We develop straight on lix-project/hydra, as discussed a few times on
the Lix development channel.
2024-08-13 01:11:31 +02:00
Pierre Bourdon 42b3977e8f
flake: remove an extra nixpkgs lying around 2024-08-13 00:38:51 +02:00
Pierre Bourdon 17c342b33e
Partial revert "Add Grapevine Matrix server and matrix-hookshot"
This partially reverts commit d2f3ca5624.

Said commit requires IFD to eval, which is generally unwanted, and is
currently forbidden on Hydra (imo: rightfully so, we should try to
properly separate evals from builds).

The services/ file for grapevine is kept but will not work without the
flake.nix change reapplied.
2024-08-13 00:35:10 +02:00
Pierre Bourdon ca904d7b4e
tf: use tf.ref instead of config.resource.* when dependencies matter
Using config.resource.* gets interpolated by Nix, whereas tf.ref gets
interpolated by Terraform. The latter ends up generating implicit
dependencies between resources.

In practice, the lack of dependencies was only showing up when creating
a new Hydra project + jobset at the same time - the concurrent /
misordered creation sometimes required two different TF applications to
create first the project then the jobset (the first application would
end up with a failure).
2024-08-12 19:36:50 +02:00
raito 84efd0976d feat(alerts): add a sync failed too often alert
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-09 16:25:34 +02:00
raito e2f5a7b0e4 feat(alerts): add basic postgresql alerts
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-09 16:06:34 +02:00
raito 7388de79c4 feat(alerts): add some basic "host & hardware" alerts
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-08-09 16:06:34 +02:00
Ilya K f8cad42b5c Set up alertmanager-hookshot-adapter 2024-08-09 14:03:56 +00:00