lsof in tests is never exercised #156
3 participants
Reference: lix-project/lix#156
Either this needs to be removed altogether or it needs to be fixed:
export _NIX_TEST_NO_LSOF=1
Brought to attention in https://gerrit.lix.systems/c/lix/+/580/comment/4f83a1c6_4e8cb552/
lsof is a somewhat weird dependency. While it should work, macOS has several functions in libproc.h that may allow us to get this information directly more efficiently than lsof.
If the main issue is that the tests take long enough to exceed a reasonable timeout, then replacing the implementation may be a good idea. I can try writing this, though I would probably want to move OS-specific code out of gc.cc, since the number of ifdefs would get annoying.
A bit more info now that I figured out how to do stuff on a mac: upstream lsof is highly inefficient on macs.
On my 2012 MacBookPro9,2 with an i5-3210M running macOS 14.4.1 and very little happening:
/usr/sbin/lsof -n -w -F n >/dev/null
takes 240ms
/run/current-system/sw/bin/lsof -n -w -F n >/dev/null
takes 40 seconds
It's not entitlements (I unsigned the system lsof and it's still as fast). Checking in dtruss it seems to be that upstream lsof makes 50x as many proc_info syscalls as system lsof.
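For anyone who wants to repeat the comparison on their own machine, here is a rough sketch. The helper names time_lsof and compare_lsof are made up for illustration and are not part of nix or lix; it also assumes date +%s is available, which gives only whole-second resolution. That is coarse, but enough to tell a sub-second run from a 40 second one.

```shell
#!/bin/sh
# Hypothetical helper: time one lsof binary with the flags quoted above.
# Prints "skip: <path>" when the binary is absent, so the sketch is
# harmless on machines that lack one of the candidates.
time_lsof() {
    bin=$1
    if [ ! -x "$bin" ]; then
        echo "skip: $bin"
        return 0
    fi
    start=$(date +%s)
    # Ignore the exit status; only the elapsed time matters here.
    "$bin" -n -w -F n >/dev/null 2>&1 || true
    end=$(date +%s)
    echo "$bin: $((end - start))s"
}

# Candidates: the macOS system lsof, the Nix-provided lsof measured
# above, and whatever a bare "lsof" resolves to in PATH.
compare_lsof() {
    time_lsof /usr/sbin/lsof
    time_lsof /run/current-system/sw/bin/lsof
    resolved=$(command -v lsof 2>/dev/null || true)
    if [ -n "$resolved" ]; then
        time_lsof "$resolved"
    fi
}
```

Calling compare_lsof prints one line per candidate. The third line shows which lsof a bare invocation would pick up from PATH and how long it takes.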
It looks like the reason
nix-store --gc --print-roots
is reasonably fast on macOS outside of testing is that it's using system lsof. During build configure.ac falls back to
-DLSOF="lsof"
if the lsof command isn't found, so nix runs whatever is in PATH.
The options for making the nix gc take a reasonable time are probably:
/usr/sbin/lsof
instead of upstream lsof (I have tried and tests work)
I'm working on a rewrite using libproc but if people want a solution with fewer moving parts then using
/usr/sbin/lsof
would work.
I discovered why upstream lsof is so slow: There's an undocumented API that's been in XNU since OS X 10.10 that allows you to ask for only regions (like what you'd find in /proc/pid/maps on linux) backed by a file. Upstream lsof doesn't use it, so has to go through every region, which means many thousands of additional syscalls.
See my comment in a WIP commit:
e6c0972318/src/libstore/gc.cc (L479)
https://gerrit.lix.systems/c/lix/+/723