Commit graph

145 commits

Author SHA1 Message Date
Yureka bce44930b1 builders: provision ssh hostkeys on boot 2024-08-04 18:12:02 +02:00
Yureka 79dea0686b add 'notipxe' netboot loader based on systemd-initrd + u-root 2024-08-03 20:28:57 +02:00
Yureka aeb8102ae4 builders: do not mount / and /boot on netboot systems 2024-08-03 20:01:39 +02:00
Yureka 830dcbf6bc builders: do not mount / and /boot on netboot systems 2024-08-03 18:41:01 +02:00
Yureka 93822775a9 baremetal-builders: do not create swapfile on rootfs when netbooting 2024-08-03 18:10:59 +02:00
Yureka dd028656ac builders: fix serial console 2024-08-02 13:21:04 +02:00
Yureka 88317d099c attempt to fix netboot hydra jobs 2024-08-02 01:05:20 +02:00
Yureka 1cbf286f18 build netboot files from hydra 2024-08-01 22:47:25 +02:00
Yureka 6dc424dd43 wob01: serve an ipxe over iusb-spoof 2024-08-01 22:16:48 +02:00
Yureka 504a443acc adjust hydra-gc numbers
we want to see how garbage collection would behave on a 480GB drive
2024-07-31 23:44:08 +02:00
emily 96d58bbd41 forgejo: disable users explore page
This was requested and should make it a decent bit more difficult to get
a somewhat complete list of users on this instance.

We are, however, aware of other endpoints that can be used to get to a
similar result. Those just aren't as convenient or obvious.

https://forgejo.org/docs/latest/admin/config-cheat-sheet/#service---explore-serviceexplore
2024-07-31 01:42:05 +02:00
Yureka 5154906aac fix eval in assignments.nix 2024-07-30 17:23:54 +02:00
Yureka f3828368e6 hydra: set reasonable max-jobs and cores 2024-07-30 17:03:12 +02:00
Yureka 4e2d21930f baremetal-builders: detect percent_filled for the correct partition 2024-07-30 13:59:46 +02:00
Yureka 99259356f2 make buildbot-signing-key accessible to buildbot-worker 2024-07-28 23:30:38 +02:00
Yureka 5474832b07 baremetal builders: filesystem optimizations 2024-07-28 19:20:23 +02:00
Yureka 15a684c5d7 baremetal-builders: more 'intelligent' gc 2024-07-26 12:17:27 +02:00
Yureka 74e06ac6d0 hydra gc every 20h
metrics analysis has shown that this is unlikely to fill up the builders
2024-07-24 09:35:18 +02:00
raito e5a3ce2283 buildbot fixes (#76)
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
Signed-off-by: Yureka <yureka@forkos.org>
Co-authored-by: raito <raito@noreply.git.lix.systems>
Co-committed-by: raito <raito@noreply.git.lix.systems>
2024-07-24 06:44:25 +00:00
Ilya K bebc7f2586 We have nothing to hide 2024-07-23 18:09:49 +03:00
Pierre Bourdon 608c0e5973 hydra: bump to 16 evaluation workers, we have enough RAM and cores to afford it 2024-07-22 23:13:33 +02:00
raito 62ccc0282b fix(ows): per-job runtime directories + proper local refspec
The local refspec was weird and exploited an edge case in the nixpkgs
jobs where local and from were the same.

We are more explicit now, which fixes the sandbox jobs.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-22 15:41:47 +02:00
Yureka d84a43b781 builders: run gc 3x per day
We can still adjust it if the disks fill up, but currently it is too frequent
2024-07-21 19:49:21 +02:00
Yureka 2dc5899660 baremetal: run hydra store gc as builder user 2024-07-20 17:00:39 +02:00
Yureka adaf4b0aef baremetal: tmp on the same filesystem as hydra store 2024-07-20 17:00:39 +02:00
Yureka 5bde7e2358 use dedicated store partition for hydra builds 2024-07-20 15:14:00 +02:00
Yureka d9809e1e78 gerrit-one-way-sync: disallow auto-merging a staging iteration into master 2024-07-20 15:14:00 +02:00
Yureka 3fa4a25d87 gerrit-one-way-sync: set git user info 2024-07-20 15:14:00 +02:00
Yureka 0ff5eea4ed gerrit-one-way-sync: merge instead of rebase 2024-07-20 15:14:00 +02:00
raito 80c4757571 gerrit01: add a one-way-sync service
It's basic and does not handle conflicts, which need to be managed
manually.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-19 17:52:44 +02:00
Ilya K d1e64b6610 Fix eval warning here too 2024-07-19 12:06:03 +03:00
Ilya K 766dc4c383 Mimir also wants network-online.target
Thank you helpful eval warning
2024-07-19 12:03:55 +03:00
Ilya K 65b07a936b Make sure Mimir starts after network is up 2024-07-19 12:00:52 +03:00
raito 8afcf249d6 buildbot: upgrade to local machine specifications
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-18 12:18:02 +02:00
raito 4473717e9f gerrit: introduce buildbot checks plugin
It's a modified version of @puck's Lix buildbot checks for
gerrit.lix.systems with a slight generalization in the configuration for
many repositories.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-18 10:56:46 +02:00
raito da7175303c buildbot: add support for remote builders via baremetal machines
For now, only builder-3 is used.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-17 18:28:26 +02:00
raito 7789e9ce75 services/buildbot: init
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-17 18:00:51 +02:00
raito fda59ee6c0 gerrit: factor more configuration in the NixOS module for external consumption
Other modules may require information to configure themselves from the
Gerrit module.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-17 15:43:35 +02:00
emily 95b58de737 forgejo: use redis as cache and session provider 2024-07-16 20:09:15 +02:00
emily 8b9d33d70c forgejo: disable registrations, enable auto-registration for SSO 2024-07-16 17:14:23 +02:00
emily dd069c40d7 forgejo: init service 2024-07-16 15:44:06 +02:00
Luke Granger-Brown e3e60a5e72 services/monitoring: add scraping of Gerrit's internal metrics 2024-07-15 11:02:54 +00:00
Luke Granger-Brown 2e86babc8a services/gerrit: add metrics-prometheus-exporter 2024-07-15 11:02:54 +00:00
Ilya K 7a937e837a Unlimit Mimir max series 2024-07-13 15:52:46 +03:00
Pierre Bourdon 7d9461808c builders: configure a swapfile + zswap 2024-07-13 04:40:51 +02:00
Pierre Bourdon 293bc52ace hydra: reduce number of parallel builds per builder to limit RAM consumption 2024-07-13 04:38:24 +02:00
Pierre Bourdon 756341ea4c builders: tune sshd MaxStartups to avoid rate limiting Hydra 2024-07-12 21:57:04 +02:00
Yureka e6ead602f0 builders get a special treatment for dns64 2024-07-11 02:05:58 +02:00
Yureka b14f155d55 add ipmitool on vpn-gw and builders 2024-07-10 20:49:17 +02:00
Pierre Bourdon d2336262fb hydra: set allowed URIs in restricted mode for flake inputs 2024-07-10 18:52:22 +02:00
Pierre Bourdon 411d514ab9 hydra: user hydra-www needs nix-daemon access too 2024-07-10 17:36:39 +02:00
Pierre Bourdon f74d1ca0f6 hydra: start signing paths 2024-07-10 17:34:57 +02:00
Ilya K e84b362b7a Allow 12 hours of backfill for metrics
This is somewhat experimental and may explode, but we'll see, I guess
2024-07-10 14:59:09 +03:00
Ilya K 9e7e6d42ab Make nginx/loki/mimir go fast 2024-07-10 14:55:28 +03:00
Pierre Bourdon f2c2bc5ab6 hydra: output machine host key as base64 in the generated machines.conf 2024-07-10 02:16:45 +02:00
Pierre Bourdon f214da9228 hydra: add hydra to nix trusted-users 2024-07-10 02:03:33 +02:00
Luke Granger-Brown 82db8f7f1e gerrit01: some more tuning
* flip off proxy_buffering again
* enable REVWALK_USE_PRIORITY_QUEUE
* enable delta compression, because that's not a bottleneck and it's
  nicer on bandwidth
2024-07-10 00:27:36 +01:00
raito 9988811be5 hydra: unplug the EPYC
thank you for your testing services

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-10 01:13:10 +02:00
raito 2308870aa5 builders: add a nice tag to deploy all of them at once
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-10 00:59:31 +02:00
raito 645ad7d062 builders: add builder user
currently hardcoded to hydra's coordinator public key

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-10 00:55:25 +02:00
raito a30c1f7d78 hydra: wire up new builders
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-10 00:45:02 +02:00
Yureka eb21cb6916 add baremetal builders 2024-07-10 00:35:01 +02:00
raito 3828721e4f services/netbox: enable OIDC via Lix SSO
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-09 02:45:58 +02:00
Luke Granger-Brown 8a9ff8c40d services/gerrit: migrate to Gerrit from the-distro/nix-gerrit flake 2024-07-08 23:30:59 +01:00
Pierre Bourdon 7f46e5d9a4 services: add ofborg, currently running rabbitmq only 2024-07-08 23:55:11 +02:00
Ilya K b55475c12e Fix up the rest of the dashboards 2024-07-08 11:43:57 +03:00
Ilya K 9f0e601d84 Scrape grafana/loki/mimir own metrics 2024-07-08 10:25:15 +03:00
Ilya K 209f71c63a Update node_exporter dashboard for new metrics structure 2024-07-08 10:16:37 +03:00
Ilya K 563e0685d4 Metrics fixups
- fix grafana-agent config format
- rekey metrics-push-password for fodwatch
2024-07-08 10:01:25 +03:00
emily 8d2a367e92 grafana-agent: make bagel.monitoring.grafana-agent.exporters an attrset
This allows us to use multiple jobs, one for each additional exporter,
and set their `job_name` accordingly.

`job_name` is exported as `job` label on the resulting metrics.
This allows us to quickly see which metrics of an exporter are
actually available by simply filtering all metrics by
`{job="$jobname"}`
2024-07-08 09:34:26 +03:00
emily db8c831c2f grafana-agent: set hostname label on all metrics
This is handy to quickly see all metrics exported by a node, without
having to mangle the already existing `instance` label.

`hostname` is essentially a variant of `instance` but without ports.
2024-07-08 09:34:26 +03:00
Ilya K ba0d50624d Switch to push metrics with Grafana Agent 2024-07-08 09:34:24 +03:00
Ilya K 40ba3c4ae7 Prepare for remote push metrics 2024-07-08 09:33:59 +03:00
Ilya K 346a74eabc Wire up Grafana to Alertmanager 2024-07-08 09:33:59 +03:00
Ilya K e8e262c6a4 Enable Mimir Alertmanager, add example alert
Still TODO: actually connect it to Matrix
2024-07-08 09:33:59 +03:00
Pierre Bourdon caa1fce74e hydra: move to hydra.forkos.org 2024-07-07 23:53:21 +02:00
Ilya K 5b0f3c4541 Split node_exporter and cadvisor config, disable cadvisor for nodes that are themselves containers 2024-07-05 20:06:43 +03:00
raito b319b02f07 fix: remove custom logging format for Gerrit
This way, we get picked up by the LGTM stack exporter machinery.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-05 18:52:38 +02:00
Ilya K 2441d18f17 Add Loki + Promtail setup 2024-07-05 16:10:31 +00:00
Ilya K 03cb9c390c Add postgres exporter 2024-07-05 16:10:31 +00:00
Ilya K 42f8ad8fa4 Add nginx log exporter 2024-07-05 16:10:31 +00:00
Ilya K 63b31e98cf Add Grafana/Prometheus/Mimir minimal setup
More later, Loki also later.
2024-07-05 16:10:31 +00:00
Pierre Bourdon 34a29552da hydra: update the epyc.infra.newtype.fr public host key 2024-07-05 16:43:29 +02:00
raito 0b01e9a99f gerrit01: those who finetune even further
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-05 12:23:44 +02:00
raito 6c237e8d40 gerrit01: make it go brrr on https clone
proxy_buffering was the root cause.

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-04 14:42:49 +02:00
Pierre Bourdon e387fffd66 hydra: add i686-linux support to the remote builder because nixpkgs bootstrap relies on it, even on x86_64 2024-07-04 13:44:59 +02:00
raito 182e55c35f gerrit01: rename to cl.forkos.org
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-03 10:58:49 +02:00
raito 98a33e4300 gerrit01: init
With:

- A package hierarchy
- A source-based Gerrit deployment

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-01 21:22:36 +02:00
raito e3f3c87c0d meta01: init
Includes:

- Raito VM module
- Raito proxy aware NGINX module
- Base server module
- Sysadmin module
- New SSH keys
- Netbox module

Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
2024-07-01 19:40:37 +02:00
Pierre Bourdon be5c6f0656 postgres: fix permissions on the dataDir, it refuses 0770 2024-06-24 21:45:17 +02:00
Pierre Bourdon 2ed6f92ed8 postgres: bump max connections count 2024-06-24 21:45:17 +02:00
Pierre Bourdon cb6e5b1652 hydra: actually use version from flake 2024-06-24 21:45:17 +02:00
Pierre Bourdon 73aecaef41 hydra: provide S3 and SSH credentials (via agenix) 2024-06-24 20:59:19 +02:00
Pierre Bourdon 04bd33e32c infra: add agenix, add s3 credentials 2024-06-24 18:03:20 +02:00
Pierre Bourdon 91beb0eddc bagel-box: add postgres+hydra 2024-06-24 18:03:20 +02:00
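
Commit db8c831c2f in the list above describes the `hostname` label as a variant of `instance` without the port. A minimal Python sketch of that relationship, assuming `instance` values take the common `host:port` form. The function name `hostname_from_instance`, the port 9100, and the sample values are hypothetical illustrations and are not taken from the repository, which may set the label differently:

```python
def hostname_from_instance(instance: str) -> str:
    """Return the host part of a `host:port` instance label value.

    Hypothetical helper for illustration only; not code from the repository.
    """
    # Bracketed IPv6 literal, e.g. "[2001:db8::1]:9100".
    if instance.startswith("["):
        end = instance.find("]")
        if end != -1:
            return instance[1:end]

    host, sep, port = instance.rpartition(":")
    # Strip the suffix only when it looks like a port number and the
    # remainder is not an unbracketed IPv6 address.
    if sep and port.isdigit() and ":" not in host:
        return host

    # No recognizable port: return the value unchanged.
    return instance


if __name__ == "__main__":
    # Assumed sample values.
    for value in ("builder-3:9100", "[2001:db8::1]:9100", "builder-3"):
        print(value, "->", hostname_from_instance(value))
```

With a label like this, all metrics from one node can be selected by host name alone, regardless of which exporter port they were scraped from.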