jujutsu support in libfetchers #799
Reference: lix-project/lix#799
Is your feature request related to a problem? Please describe.
I'd like to be able to run nix apps from a jujutsu repo. However, if files are added in the jj working copy, they will not be in Git's HEAD (which is the parent of the jj working copy), nor the git index (which isn't used at all AFAIU), so they will be missing if I simply run `nix run .#hello`.

Describe the solution you'd like
A jujutsu fetcher could be added; this would be a thin wrapper around the git fetcher, but would resolve the jj expression `@` into a git commit ID rather than using the git index tree as its default behaviour (when no `rev=` or `ref=` is specified).

Describe alternatives you've considered
As a workaround, I can run `nix run .?ref=$(jj log -r @ -T commit_id --no-graph)#hello`.

When I was using nix with sapling, I had a fake .git which contained a fake index which contained all the files I wished nix to see. This was janky garbage but it worked. I dunno if jj can be convinced to have a fake index, but it is a possibility workaround wise.
In a colocated repo, there is already a "real" .git directory that reflects the commit prior to the one that is currently edited. Newly created files are added automatically to the jj repo, but not to the Git one until the next commit. Of course, workarounds exist (like the one from the initial comment, or adding the files to the Git index manually), but the point is that they are annoying.
They are added to the git repo too, as commits even! Otherwise my workaround above wouldn't work. They just don't have any git refs pointing to them.
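For readers who want to script the workaround from the issue description, here is a minimal Python sketch. It assumes `jj` and `nix` are on `PATH` and that `jj log -T commit_id` prints a full 40-character hex commit ID; the function names (`current_commit_id`, `flake_ref`, `run_app`) are made up for illustration and are not part of Lix or Jujutsu.

```python
import re
import subprocess


def current_commit_id(repo_dir="."):
    """Ask jj for the Git commit ID of the working-copy commit (@).

    Uses the same invocation as the workaround quoted in the issue.
    """
    out = subprocess.run(
        ["jj", "log", "-r", "@", "-T", "commit_id", "--no-graph"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout.strip()
    # Assumption: a Git-backed repo yields a 40-character hex SHA-1.
    if not re.fullmatch(r"[0-9a-f]{40}", out):
        raise ValueError(f"unexpected jj output: {out!r}")
    return out


def flake_ref(commit_id, attr, path="."):
    """Build the installable used in the workaround, e.g. '.?ref=<id>#hello'."""
    return f"{path}?ref={commit_id}#{attr}"


def run_app(attr, repo_dir="."):
    """Resolve @ and hand the resulting installable to `nix run`."""
    ref = flake_ref(current_commit_id(repo_dir), attr)
    subprocess.run(["nix", "run", ref], cwd=repo_dir, check=True)
```

Calling `run_app("hello")` from inside the repository should behave like the one-liner above. As noted in the reply, this only works because jj writes the working-copy commit into the Git object store.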
This sounds like the exact same problem in Git as "I created new files but Nix doesn't see them until I add them to the index". The only difference is that in jujutsu there is no index. The simple workaround here is to just run `jj new` so the files now exist according to git, and to then just use `jj squash` to merge any subsequent changes into the commit (since I assume you're still hacking on the commit).

I do think there's value in teaching Lix about jujutsu repos, but specifically for non-colocated repos (where teaching Lix about it just means teaching it to run `jj --ignore-working-copy git root` to find the git dir if it sees a `.jj` dir, and then invoking the git import path; this would require extending the `git+file:` scheme with a `gitDir` param). If we support non-colocated repos then you could make an argument that the calculation of the `git+file:` scheme should also resolve the working copy to its git hash for the `rev` param.

But absent support for non-colocated repos, this feels a little less compelling. And this could actually be worked around on the jujutsu side too; there's an argument to be made that jujutsu should do `git add -N` for any files added in the working copy, so that way they show up on the git side as working copy paths rather than untracked paths. This would transparently solve the problem in Nix (and would be helpful in other ways too).

I suppose you could actually just run `git add -N .` yourself in the colocated git repo, that shouldn't interfere with jujutsu, you'll just have to re-run that if you ever switch the working copy away and back (or after adding more files).

In fact, it looks like this has been mentioned before in the jujutsu discord and Martin said there's no objection to doing this just nobody's gotten around to it, so I'm sure they'd welcome a PR! Someone else said conflicts pose a problem but at the very least it should be fine to do `git add -N` for new files if the working copy is not conflicted.

@lilyball wrote in #799 (comment):
This is my understanding (for colocated repositories, which the author of the issue seems to have had in mind too) as well. I'd say the difference is mainly in expectation: a git user may very well not care about untracked files that much (but enough users do that there exists a Nix issue about this somewhere), while a jj user will definitely care about git-untracked files due to the absence of the index.
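The detection step suggested earlier for non-colocated repos (notice a `.jj` directory, then ask `jj` where the Git directory lives) could look roughly like the following Python sketch. The helper names are hypothetical, and treating "a `.git` entry next to `.jj`" as the test for colocation is a simplifying assumption, not something Lix or Jujutsu specifies.

```python
from pathlib import Path


def find_jj_workspace(start):
    """Walk up from `start`; return the first directory containing `.jj`, or None."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / ".jj").is_dir():
            return candidate
    return None


def is_colocated(workspace):
    """Heuristic (assumption): a colocated repo has `.git` alongside `.jj`."""
    return (Path(workspace) / ".git").exists()


def git_root_command():
    """The command from the thread that prints the backing Git directory.

    --ignore-working-copy keeps this lookup from snapshotting the working copy.
    """
    return ["jj", "--ignore-working-copy", "git", "root"]
```

A fetcher would run `git_root_command()` with the workspace as its working directory and pass the printed path on to the existing Git import path, for example through the `gitDir` parameter proposed above.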
The nice thing about this is that it does a snapshot. The problem with doing it as a fetcher is that either it would have to cause snapshots (which means shelling out to `jj` and a whole variety of potential weirdness/failure modes from that), or it just replaces “forgot to `jj new` so my file isn’t seen” with “forgot to `jj` to snapshot so my file isn’t seen”. Though, with the Watchman snapshot trigger, you get snapshots for free and the naive non‐snapshotting fetcher would behave nicely.

Do you have specific potential weirdness/failure modes in mind for the auto-snapshotting? The most obvious things that come to mind for me are:

`jj` not being around (in which case I'd say error out, if there's a `.jj`)

Am I missing anything else? Snapshotting is a side effect, but I'd estimate it to be pretty harmless and unsurprising given that all the jj commands do it too?
Well, it’s a mutating operation, so it may be surprising for fetchers to mutate the state of what they’re fetching in general. But you could argue that with Jujutsu snapshotting is meant to be non‐destructive and is more like `atime` than anything. It can also be slow, but that’s true of fetchers in general.

However, I think the biggest problem is concurrent operations. Let’s say you are working on a commit message with `jj desc`, and you also change a file, and run a command that causes a snapshot. Once you finish `jj desc`, the commit with the old tree but the new description and the commit with the new tree but the old description will coexist, and you’ll see the change ID become divergent since both have the same one. That’s because the two operations had the same base operation and were automatically merged with a CRDT to produce the new repository state.

Usually, this doesn’t happen too often because you have to explicitly run multiple mutating `jj` commands simultaneously (possibly virtually using `--at-operation`). However, it causes a lot of support requests with things like editors trying to be smart about doing fetches and snapshots in the background, and people sometimes caution against the Watchman snapshot trigger (which causes a snapshot on every file operation) for the same reason.

Personally, I used to use the Watchman snapshot trigger, and I currently have my shell prompt asynchronously do a snapshot, and although it does sometimes cause divergent changes I’m pretty happy with the overall UX, and am generally very fond of not having to think too much about when snapshots happen. But “running a `nix build` can mutate the repository it’s fetching from in a way that causes divergence that the user has to manually resolve” is at least surprising enough to take into consideration, especially when it can happen automatically on new shell prompts due to use of direnv.

Another thing to consider is that Jujutsu isn’t inherently tied to the Git backend, and although the example native backend can be ignored for all practical purposes there is work ongoing on fancier native backends with VFSes and cloud stuff and so on. So this would be creating more of a `git+jj+file:` than an actual Jujutsu backend which would take a lot more design work. (There’s already a cloud backend at Google but uhh presumably nobody is using Nix with `google3`. Can you imagine the flake copying UX?)

Anyway, I do want something like this feature myself, and hate it when I am forced to think about file tracking because of Nix, so I hope I don’t seem like I’m trying to shoot the desire down. I’m just not sure if the path forward is clear enough to bake it into the codebase at present. Once the `git add -N` functionality lands in Jujutsu you could enable the Watchman snapshot trigger or have your shell prompt do snapshotting and get the same results without Lix being aware of it. It might be worth setting up one of those to see if the op log merges you’d get with this functionality in Lix bother you or not. (Note that Watchman can be kind of buggy though, especially on macOS. I reduced my reliance on it because it would fail to notice files under some circumstances. Jujutsu really wants a VFS for workspaces… once it has one, the advantages of direct integration would be huge, although much more complex too. “We have lazy trees at home.”)

For myself, I’ve personally just started hacking on a non‐flakes pure‐eval thing that preferentially uses the `jj log` invocation so that I can get the desired UX out‐of‐tree. I know that’s not a good solution to the general problem, though.

Thanks for the details! It definitely doesn't feel like you're trying to shoot it down, and I appreciate the input from someone who knows more about jj than me ^^
Do you think it would be possible to get jj to snapshot "into" something other than the current working commit? I don't see anything obvious for that in the CLI right now, but it seems like an operation that could be useful both for us and for other similar use cases.
I've also found myself several times now wishing that running nix commands would snapshot the repo I'm working on, so personally I think it would be useful at the very least as an opt-in behaviour.
Dreaming of further possibilities a bit: maybe it could even somehow annotate the created snapshot with "this command was run with this revision at this date". Extra cool if it happened with local dirty flake inputs (or similar, in the flakes case when using `--override-input`) as well.

@lheckemann wrote in #799 (comment):
Actually… maybe! https://github.com/jj-vcs/jj/pull/4457 adds an option to not commit the transaction caused by a command. It is possible that `jj log --no-commit-transaction --no-graph --color never -T commit_id -r @` will give you a commit ID that is accessible in the underlying Git object store, but that is dangling and not rooted to any Git reference. In that case, Lix could happily copy that commit without it resulting in any mutation to the visible state of the Jujutsu repository (albeit still involving making writes to a repository you build from, which again seems unexpected but may not be the end of the world).

I was unsure whether this would result in `jj/keep/*` refs being committed to Git that would be picked up the next time Jujutsu imports from Git – putting you back at the divergent change result – but I asked on the Jujutsu Discord and it seems that although it would create `jj/keep/*` refs (to avoid a race with Git GC), it would not change the Jujutsu state of the visible head commits and its view of Git refs, and `--no-commit-transaction` would not export any changes to “actual” Git refs. `git gc` would not collect the commits due to the `jj/keep/*` refs. The resulting operation would not show up in `jj op log`, and would be cleaned up by `jj util gc`. So it should also not cause race conditions with automatic Git GC or unlimited repository growth. It could race with `jj util gc --expire=now`, however. I expect that to not be a big deal in practice and not a blocker for an initial version of this feature, but it might be possible to devise a mechanism for tooling to communicate that it is holding a reference to a Jujutsu operation (for the duration of the copy), or to lock GC entirely. (Adding and cleaning up `lix/keep/*` refs would suffice for the former even without coordination with Jujutsu, for instance. Edit: Actually, this wouldn’t work because Jujutsu would import those, I guess. It’s also probably not really relevant since it’s pretty edge‐case‐y and the failure mode would just be “Lix complains about an expected object not existing and errors out”. Edit edit: Actually it would work perfectly because those would not be branches but just random refs so Jujutsu would not import them.)

So… this might just work? If people are okay with the coupling to the Jujutsu CLI and mutation of the underlying repository (albeit not of the visible state from the perspective of `jj(1)`), then I think the things to resolve would be:

1. What to call it, since this is very much tied to the Jujutsu Git backend and is really more of a way to infer a `rev` for the Git snapshotting process than something that actually works natively with Jujutsu’s idea of repositories.
2. Whether Jujutsu repositories should be automatically detected – especially colocated ones, where the user may be expecting tooling to treat them as Git repositories.

Something like `git+jj+file:` makes sense to me for (1), and I would personally prefer (2), even for colocated repositories, since the result is kind of just better. (But, like – maybe you have scripts that do both flake‐y `nix build`s and also Git commands; like they want to make new commits or record the commit hash being built. In that case they might work with a colocated repository currently but get confused by this integration if it’s automatic.)

I agree that it would be nice. Actually, it’s kind of funny that most of the things that are weird/terrible about the whole flake Git copying UX other than performance – “it looks at the working copy version of files rather than `HEAD`”, “it cares about what files existed at `HEAD` and adding files to version control isn’t automatic” – are things that are resolved cleanly in Jujutsu’s model; “the tree of `@` is added to the store” is easy to understand and has a very smooth UX.

A VFS‐based Jujutsu backend with sufficiently advanced integration could also eliminate the copying entirely while still maintaining consistent observed file state and even having a proper canonical path with the correct NAR hash ready to go, although of course even if such a thing currently existed the integration would involve major surgery to the Nix side of things and I am sure there is no appetite to rush towards doing things that look like lazy trees. In general though, I think that proper VCS–build system integration is the future and that we could potentially one day have very nice things.
That all sounds fantastic! I would also much prefer detecting jj repos automatically if present (and failing hard, informing the user how to explicitly use git, if the `jj` CLI isn't available).

I agree that GC safety doesn't feel like a problem we're likely to run into even if we don't account for it explicitly. Locking GC entirely does however seem like an easy way to get that safety without significant disadvantages (and we could look into smarter strategies if/when this "big hammer" approach ever does turn out to be a problem -- at which point it would demonstrate that we do in fact need the safety).
As far as VFS stuff is concerned, yeah, not something I can see the Lix project rushing into, but it absolutely does feel like something that would mesh really well with jj.
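The "fail hard if the `jj` CLI isn't available" behaviour mentioned in this comment might be sketched as below. The exception class, function name, and error wording are all hypothetical; only `shutil.which` is a real standard-library call.

```python
import shutil


class JjUnavailableError(RuntimeError):
    """Raised when a `.jj` directory exists but the `jj` command does not."""


def require_jj(workspace, which=shutil.which):
    """Return the path to `jj`, or fail with a hint about using Git explicitly.

    `which` is injectable so the lookup can be exercised without a real PATH.
    """
    path = which("jj")
    if path is None:
        raise JjUnavailableError(
            f"{workspace} contains a .jj directory, but the `jj` command was "
            "not found; install jj, or use an explicit git+file: URL to treat "
            "the repository as plain Git"
        )
    return path
```

Erroring out, rather than silently falling back to the Git behaviour, keeps the result from depending on whether `jj` happens to be installed.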
I forgot one additional important point. When there is a Jujutsu‐side conflict, the Git trees look like this:
This makes using `@` rather unpleasant if there are conflicts. I think you would ideally want to match the way conflicts are materialized in the on‐disk working copy in Jujutsu. I do not know if there is a simple way to do that without actually getting Jujutsu to materialize them in a workspace. I think there has been talk about combining `--no-commit-transaction` with workspaces, which would allow that, but it would essentially duplicate the copying overhead.

I assume the appetite for reimplementing Jujutsu’s conflict materialization logic is nil, as it probably should be. `jj file show` can do it for any given file. Any file that is identical between `.jjconflict-side-*` should be safe to materialize as‐is (although confusingly this is not true of the underlying patch theory, when it differs in the base, and it is possible that Jujutsu will one day gain a conflict handling mode that exposes this). The rest are conflicted and I don’t know what the ideal solution would be. But the annoying thing about not doing anything at all is that your `flake.nix` etc. seem to disappear as soon as you have a conflict in any file.

I think there has been discussion, or maybe even implementation, of materializing conflicts within the Git index to reduce this kind of problem. However if we copy directly from a Git commit object we will not benefit from that.
Good point! I think that detecting conflicts and failing hard on them (with a helpful error message) would probably be reasonable and easy, at the very least for a start? Doing any sort of conflict materialisation would be weird, especially because how they're materialised is configurable.
That seems okay for an initial version, but the most annoying thing about it is that it means your flake development shell breaks when a random source file is conflicted, which then gets in the way of having the tools you need to resolve the conflict intelligently in the first place. With Git that doesn’t happen because it just looks at the worktree contents which include the conflict markers in the relevant files and everything else is normal.
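A "detect conflicts and fail hard" check, as proposed for an initial version, could be as small as the Python sketch below. It assumes that a conflicted commit's top-level Git tree contains entries whose names start with `.jjconflict-` (the thread only mentions `.jjconflict-side-*`, so the broader prefix is a guess); the function and exception names are illustrative only.

```python
import subprocess


class ConflictedWorkingCopyError(RuntimeError):
    """Raised when the commit to be fetched contains Jujutsu conflict trees."""


def top_level_entries(git_dir, rev):
    """List top-level names in the tree of `rev` using `git ls-tree`."""
    out = subprocess.run(
        ["git", f"--git-dir={git_dir}", "ls-tree", "--name-only", rev],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


def conflicted_entries(tree_entry_names, prefix=".jjconflict-"):
    """Return the entries that look like conflict sides (assumed naming)."""
    return sorted(n for n in tree_entry_names if n.startswith(prefix))


def check_not_conflicted(tree_entry_names):
    """Fail hard with a pointer to the fix if conflict entries are present."""
    found = conflicted_entries(tree_entry_names)
    if found:
        raise ConflictedWorkingCopyError(
            "working-copy commit is conflicted (" + ", ".join(found) + "); "
            "resolve the conflict or pass an explicit rev"
        )
```

This does nothing for the drawback raised in the last comment: a check like this still breaks the development shell while any file is conflicted.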