Locale for the daemon on macOS is plausibly wrong #511
Reference: lix-project/lix#511
It is quite plausible that the daemon is accidentally running in the C locale on macOS, since LC_CTYPE is set by Terminal.app but possibly not by launchd. Someone needs to check that the Lix daemon on macOS has the correct locale when started by launchd.
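For whoever picks this up, here is a rough sketch of how the effective LC_CTYPE falls out of an environment, following the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG). The function names `effective_lc_ctype` and `looks_utf8` are made up for illustration and are not anything in Lix; the UTF-8 check is only a heuristic on the locale name.

```python
def effective_lc_ctype(env):
    """Return the locale name that would be selected for LC_CTYPE from env."""
    # POSIX precedence: LC_ALL, then the category variable, then LANG.
    # Variables that are unset or empty are skipped.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            return value
    # Nothing set: assume the default ends up being the "C" locale.
    return "C"


def looks_utf8(locale_name):
    """Heuristic: does the locale name name a UTF-8 codeset?"""
    # Drop any "@modifier", then take the codeset after the last ".".
    # If there is no ".", the whole name is checked (this assumes a bare
    # "UTF-8" LC_CTYPE value may appear, as Terminal.app is believed to set).
    base = locale_name.partition("@")[0]
    codeset = base.rpartition(".")[2]
    return codeset.lower().replace("-", "") == "utf8"
```

Calling `effective_lc_ctype(os.environ)` from a process spawned the same way as the daemon would show what it actually inherits; an empty environment gives `"C"`, which is the case this issue is worried about.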
Running with a non-UTF-8 locale can cause all kinds of weirdness. It appears that POSIX states that strcasecmp (which we use for casehack verification) depends on LC_CTYPE, which may not be set in our launchd plists, and thus casehack may be broken for Unicode filenames.
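To illustrate the failure mode, here is a small model of what a byte-wise, ASCII-only case-insensitive compare does to UTF-8 names, next to a Unicode-aware one. This is only a model under the assumption that the C locale folds nothing outside A-Z; it does not call libc's strcasecmp, and both function names are invented for the example.

```python
def ascii_casecmp_equal(a: str, b: str) -> bool:
    """Model of a C-locale case-insensitive compare on UTF-8 bytes."""
    def fold(s: str) -> bytes:
        # Lowercase only the bytes for A-Z; every other byte is untouched.
        return bytes(c + 32 if 0x41 <= c <= 0x5A else c
                     for c in s.encode("utf-8"))
    return fold(a) == fold(b)


def unicode_casecmp_equal(a: str, b: str) -> bool:
    """Unicode-aware comparison using full case folding."""
    return a.casefold() == b.casefold()


# ASCII names agree under both comparisons.
ascii_same = ascii_casecmp_equal("README", "readme")
# U+00C9 vs U+00E9: a case-insensitive filesystem may treat these as the
# same name, but the ASCII-only fold sees two different byte strings.
ascii_accent = ascii_casecmp_equal("\u00c9mile", "\u00e9mile")
unicode_accent = unicode_casecmp_equal("\u00c9mile", "\u00e9mile")
```

If the filesystem considers the two accented names equal but the verification does not, the case hack would miss that collision.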
I expect we already have a few round-tripping corruption bugs with UTF-8 filenames, at least on HFS+, where filenames are stored in NFD. This implies that the case hack is bork; we already know it is bork, but this is a new variant of bork #332.
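A quick sketch of the round-trip problem, using plain NFD as a stand-in for what the filesystem does (an assumption: HFS+ uses its own variant of decomposition, so this is an approximation). `store_name_nfd` is a hypothetical helper, not a real API.

```python
import unicodedata


def store_name_nfd(name: str) -> str:
    """Hypothetical stand-in for a filesystem that stores names decomposed."""
    return unicodedata.normalize("NFD", name)


original = "caf\u00e9"              # precomposed e-acute: 4 code points
stored = store_name_nfd(original)   # "e" followed by U+0301: 5 code points

# The name read back is not byte-identical to the name that was written.
roundtrips = stored == original
# It only matches again after normalizing both sides to the same form.
matches_after_nfc = unicodedata.normalize("NFC", stored) == original
```

Anything that compares or hashes the listed name against the name it wrote would see a mismatch here.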
Fundamentally you cannot extract NARs on every system and expect them to work, because the filesystems may have wildly different opinions about what a filename means Unicode-wise, and the only way to find out is to fuck around (namely, eat a syscall error). Case sensitivity is merely the most obvious instance in the English-speaking world.
HFS+ stores filenames in not-quite-NFD. APFS is normalization-preserving but keys on normalized file names [1], and Apple tried to hide this by messing with the Cocoa APIs (which means we are unaffected by Apple trying to normalize our file names for us, but we are affected by multiple normalizations colliding).
[1]: filenames are looked up by a hash of their normalized (and, on case-insensitive volumes, case-folded) form, so two different normalization forms of the same filename cannot coexist
https://eclecticlight.co/2021/05/08/explainer-unicode-normalization-and-apfs/
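As a sketch of how the colliding-normalization case could be detected ahead of time, the following groups names by an approximate lookup key (NFD, plus case folding for case-insensitive volumes). The exact normalization and folding APFS applies is not reproduced here, so treat the key as an assumption; `lookup_key` and `find_collisions` are illustrative names only.

```python
import unicodedata
from collections import defaultdict


def lookup_key(name: str, case_insensitive: bool = True) -> str:
    """Approximate the key a normalization-insensitive filesystem uses."""
    key = unicodedata.normalize("NFD", name)
    if case_insensitive:
        # Case folding can introduce new composed characters, so
        # normalize once more afterwards.
        key = unicodedata.normalize("NFD", key.casefold())
    return key


def find_collisions(names, case_insensitive: bool = True):
    """Return groups of distinct names that would map to the same key."""
    groups = defaultdict(list)
    for name in names:
        groups[lookup_key(name, case_insensitive)].append(name)
    return [group for group in groups.values() if len(group) > 1]


names = ["caf\u00e9", "cafe\u0301", "README", "readme", "unique"]
insensitive = find_collisions(names, case_insensitive=True)
sensitive = find_collisions(names, case_insensitive=False)
```

On a case-sensitive volume only the two spellings of the accented name collide; on a case-insensitive one the `README`/`readme` pair collides as well.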
This issue was mentioned on Gerrit on the following CLs: