Figure out the Gerrit/Pyroscope situation #108
Right now we're using Grafana Alloy to profile Gerrit, which does some horrible things with a vendored prebuilt profiler that doesn't even work correctly (see:
2c70be2d20/internal/component/pyroscope/java/asprof/asprof.go
). I'm not sure what a good solution is here. Patch Alloy to use async-profiler from nixpkgs? Just invoke async-profiler ourselves and shove the output into Pyroscope somehow? Pyroscope can accept some Java specific format: https://grafana.com/docs/pyroscope/latest/configure-server/about-server-api/#jfr-format
What's wrong with vendored asprof, exactly?
I'm not sure how easy it would be to patch Alloy, but from looking at the docs of asprof, it seems not very user-friendly at all. It pretty much does profiling and that's it. It wouldn't be too hard to write a wrapper around it that would launch and stop it and so on and would send profiles to Grafana, but then it's just Alloy all over again. It's not too terrible though, so let's identify the issues that Alloy has and see if we can fix them.
It doesn't actually run properly, see https://grafana.forkos.org/explore?schemaVersion=1&panes=%7B%22s94%22%3A%7B%22datasource%22%3A%22loki%22%2C%22queries%22%3A%5B%7B%22refId%22%3A%22A%22%2C%22expr%22%3A%22%7Bunit%3D%5C%22alloy.service%5C%22%7D+%7C%3D+%60asprof%60%22%2C%22queryType%22%3A%22range%22%2C%22datasource%22%3A%7B%22type%22%3A%22loki%22%2C%22uid%22%3A%22loki%22%7D%2C%22editorMode%22%3A%22builder%22%7D%5D%2C%22range%22%3A%7B%22from%22%3A%22now-24h%22%2C%22to%22%3A%22now%22%7D%7D%7D&orgId=1
Don't have Explore access in Grafana :(
To be a bit more precise, it works some of the time, but often fails while unpacking/setting up the vendored profiler for some reason. It usually fixes itself after some time. So it's not a pressing issue, but still annoying.
Why does it try to fork/exec anyway?
I looked at the code for a few minutes and my best guess is that Alloy uses fork/exec to run asprof in a separate process, and maybe re-runs it every once in a while. I think it expects to find some data (musl or glibc libs, for example) in a tmpdir. And since tmp is, well, tmp, those folders get cleaned up sometimes and Alloy freaks out. A proper fix would probably create a new tmpdir on each restart, or use a more stable directory (StateDirectory, maybe?).
Looks like it's actually documented and configurable. I'll try submitting a PR to fix it.