lix regression for running flake update from within git rebase due to GIT_DIR environment variable #1135
3 participants
Reference
lix-project/lix#1135
Describe the bug
I have recently moved a system from NixOS 25.11 to NixOS unstable while also using Lix main.
Running `nix flake update` now runs into very confusing issues when run from within `git rebase` (more info on the workflow leading to this below).
Effectively, the `git rev-list --max-count=1 refs/heads/main` call running against the cached repo (`~/.cache/nix/gitv3`) will fail with the below error on my local machine, or a similar error in a sandbox, as long as the `GIT_DIR` environment variable is set to the current directory:
Steps To Reproduce
The following is a hopefully reasonably well documented reproducer shellscript.
Run this within a sandbox to reproduce.
shell script reproducer
shellscript output
Expected behavior
Running `nix flake update` from within `git rebase -i` should Just Work™ (IMHO).
Especially given that older Lix versions (specifically the one from 25.11 at time of writing) did work just fine.
This is compounded by the fact that this seems (I haven't checked too hard, my head already hurts from debugging this) to only ever trigger when using worktrees.
`nix --version` output
Additional context
Just to explain why this is important to me: I have several stacked branches basically inheriting each other (main ⇒ staging ⇒ unstable) with commits on them being incremental.
Rebasing all of those every time gets old real quick, so what I am doing instead is to work on worktrees where changes are actually disconnected, but keep all of those branches on a single contiguous headless worktree.
That worktree then has a .rebase file which takes care of picking the right commits and potentially does some resetting and whatnot.
This for me has become a rather useful workflow that the `GIT_DIR` stuff cuts into in a really weird way (easy to handle, but still weird and… unexpected is the word I guess).
.rebase example
Oh, just to be clear, I have no idea whether this is a Lix-specific bug or whether some change in git may have caused unintended side effects. If this turns out to be an actual issue in the way git is handling things, then of course I don't think Lix needs to change anything really. This bug is only filed against Lix because it does (seemingly) use the git from `PATH`, meaning that when I run the older version of Lix and it succeeds, this is likely due to a change in Lix, not git itself.
I also think that Lix should keep passing on `GIT_DIR` to any calls to git underneath as long as they touch the current repository. There should only ever be an attempt to isolate the git calls which deliberately deal with a different repository. The cache repositories and their operations should not be influenced by anything going on in the context of the flake repository they are a dependency of IMHO.
after the first setup block of your script (up to and excluding the first comment cat) this reproduces with
but flipping the directories (env vs `-C` arg) fixes it too. definitely a git issue. considering the amount of bustage we've seen with worktrees that's not super surprising :/
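The interaction between an inherited `GIT_DIR` and `-C` can be illustrated in isolation. The following is a minimal sketch, not the reproducer from this issue: the two throwaway repositories `a` and `b` and all variable names are made up for illustration, and it assumes only that `git` is on `PATH`.

```shell
set -eu
# Start from a clean slate in case this is itself run from a git hook or rebase.
unset GIT_DIR GIT_WORK_TREE

tmp=$(mktemp -d)
for name in a b; do
    git init -q "$tmp/$name"
    # Distinct commit messages give the two repositories distinct HEAD hashes.
    git -C "$tmp/$name" -c user.name=demo -c user.email=demo@example.invalid \
        -c commit.gpgsign=false commit -q --allow-empty -m "commit in $name"
done

a_head=$(git -C "$tmp/a" rev-parse HEAD)
b_head=$(git -C "$tmp/b" rev-parse HEAD)

# -C only changes the working directory; with GIT_DIR set in the environment,
# git skips repository discovery and keeps using repository "a".
leaked=$(GIT_DIR="$tmp/a/.git" git -C "$tmp/b" rev-list --max-count=1 HEAD)

echo "a:      $a_head"
echo "b:      $b_head"
echo "leaked: $leaked"
```

Here `leaked` matches the head of `a` even though the command was pointed at `b` with `-C`, which is the general shape of the problem described above.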
@pennae wrote in #1135 (comment):
It just clicked for me; `-C` isn't the right argument for Lix to pass here. Lix should use `--git-dir` (which is what `GIT_DIR` does), since `-C` only makes all paths interpreted as relative to that directory [1], while `--git-dir` is an authoritative "use this git repository", which is what all the git commands Lix invokes are meant to do. Since `--git-dir` has higher precedence than `GIT_DIR` this also fixes things down the line. I'm not sure if there's any operation in which Lix depends on the actual path, rather than just the repository, but if there is, both parameters should be provided.
Which, surprise twist, is what ~~the old Lix version does~~ all other git commands used by Lix seem to do (I totally missed that interaction when going through the logs):
It does use `-C` to change the directory, and then specifies `--git-dir` to be `.`, which is interpreted relative to the `-C` argument.
Which raises the question why this ~~was changed~~ is different for `rev-list`?
Edit: sorry, the log of the old Lix version bails due to a different error exactly one command earlier, the versions do more or less the same thing [2].
[1] Quoting git(1): This option affects options that expect path name like `--git-dir` and `--work-tree` in that their interpretations of the path names would be made relative to the working directory caused by the `-C` option.
[2] The older version did bail during testing due to issues with packed refs, which the new version seems to be fine with, other than a lot of `could not update mtime for file ''` warnings.
okay, so this is probably `6bf187537a` then. before that change we never shelled out to git to read refs, after that we do (and that new command indeed does not set `--git-dir`, all others do). good grief
I've got a patch over at my lix fork.
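For illustration, the `-C <dir> --git-dir=.` combination discussed in this thread can be sketched as a small shell wrapper. This is an assumption-laden sketch, not the patch itself and not Lix's actual code: `git_pinned`, the `flake` directory and `cache.git` are hypothetical names, and the bare clone merely stands in for a cache repository.

```shell
set -eu
unset GIT_DIR GIT_WORK_TREE

tmp=$(mktemp -d)
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.invalid \
        -c commit.gpgsign=false commit -q --allow-empty -m "$2"
}

# A stand-in "flake" repository with a main branch ...
git init -q "$tmp/flake"
git -C "$tmp/flake" symbolic-ref HEAD refs/heads/main
commit "$tmp/flake" "first"
# ... and a bare clone standing in for the cache repository.
git clone -q --bare "$tmp/flake" "$tmp/cache.git"
# Move the flake ahead so the two repositories disagree about refs/heads/main.
commit "$tmp/flake" "second"

cache_head=$(git -C "$tmp/cache.git" rev-parse refs/heads/main)

# Hypothetical helper: enter the repository with -C, then pin it with
# --git-dir=. (resolved relative to the -C directory), which takes
# precedence over any GIT_DIR inherited from the environment.
git_pinned() {
    repo=$1
    shift
    git -C "$repo" --git-dir=. "$@"
}

# Simulate the environment of a rebase step; the export stays inside the subshell.
pinned=$(export GIT_DIR="$tmp/flake/.git"
         git_pinned "$tmp/cache.git" rev-list --max-count=1 refs/heads/main)

echo "cache:  $cache_head"
echo "pinned: $pinned"
```

With the repository pinned this way, the lookup resolves against the bare clone regardless of what `GIT_DIR` the caller inherited.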
x86_64-linux: ✅
aarch64-linux: ✅
The regular package tests all pass fine.
Running the reproducer against the patched version yields two successful rebases.
Locally the new build also works as it used to.
`nix flake check` fails over unrelated matters.
All of `hydraJobs.tests` (at least those having an out attr) are passing.
I'll go ahead and push the change to gerrit.
This issue was mentioned on Gerrit on the following CLs:
I didn't ask before, but why does this use `rev-list --max-count=1` instead of `rev-parse` which I think would do the same thing?
not entirely sure, but since `rev-parse` is notionally a frontend to `rev-list` it does make sense to directly use the low-level tool for the low-level job. (we're more confused that it sets `--max-count=1` instead of `--no-walk`, but since both work why bother)
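As a quick check of the "both work" remark, the three invocations mentioned here can be compared on a scratch repository. This is only a sketch for a plain branch ref; the repository and variable names are made up, and other kinds of refs (annotated tags, for example) may not behave identically.

```shell
set -eu
unset GIT_DIR GIT_WORK_TREE

tmp=$(mktemp -d)
git init -q "$tmp/repo"
git -C "$tmp/repo" symbolic-ref HEAD refs/heads/main
# Two commits, so that limiting the walk actually matters.
for msg in first second; do
    git -C "$tmp/repo" -c user.name=demo -c user.email=demo@example.invalid \
        -c commit.gpgsign=false commit -q --allow-empty -m "$msg"
done

via_rev_parse=$(git -C "$tmp/repo" rev-parse --verify refs/heads/main)
via_max_count=$(git -C "$tmp/repo" rev-list --max-count=1 refs/heads/main)
via_no_walk=$(git -C "$tmp/repo" rev-list --no-walk refs/heads/main)

echo "rev-parse --verify:     $via_rev_parse"
echo "rev-list --max-count=1: $via_max_count"
echo "rev-list --no-walk:     $via_no_walk"
```

All three print the same hash, the tip of `main`, for a branch ref like this one.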