[llvm-dev] RFC: Dealing with out of tree changes and the LLVM git monorepo
Justin Bogner via llvm-dev
llvm-dev at lists.llvm.org
Wed Oct 31 16:48:20 PDT 2018
NAKAMURA Takumi <geek4civic at gmail.com> writes:
> Justin,
>
> Could you show me an example of a longer tree to be migrated?
> It's okay if it is not yours, as long as it is public on GitHub.
Unfortunately my work is a proprietary backend, so I can't share it. It
would take quite a bit of effort to make something artificial that was
realistic.
You could perhaps look at something like swift if you wanted to
experiment, but I don't know how complex their branching structure is.
> I suggest we provide a script to migrate deep trees.
>
> 1) Generate svnrev-hash maps for the monorepo and for each individual.git.
> (This may be delayed until (3).)
> 2) Run git-fast-export on the branch.
> 3) Run git-fast-import, substituting the out-of-branch hashes.
>
> I am not certain git-fast-export is mature.
> In contrast, I am certain git-fast-import is mature.
I have doubts about how effective this would be, and even if it works it
means every hash that's recorded in my bug tracker, in my commit
messages, and in release notes becomes invalid.
This seems much worse than the zipper layout to me.
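For reference, step (1) of the proposal could be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not an existing script: it assumes the mirror commits carry git-svn's usual `git-svn-id: <url>@<rev> <uuid>` trailer and that the monorepo prototype records an `llvm-svn: <rev>` trailer. The names `rev_to_hash` and `old_to_new` are hypothetical.

```python
import re

# Assumed trailer formats (not confirmed in this thread):
#   mirrors:  "git-svn-id: <url>@<rev> <uuid>"   (written by git-svn)
#   monorepo: "llvm-svn: <rev>"
GIT_SVN_ID = re.compile(r"^\s*git-svn-id: \S+@(\d+) \S+\s*$", re.MULTILINE)
LLVM_SVN = re.compile(r"^\s*llvm-svn: (\d+)\s*$", re.MULTILINE)


def rev_to_hash(commits, trailer):
    """Map SVN revision -> commit hash for one repo's (hash, message) pairs.

    Commits without the trailer (e.g. out-of-tree work) are skipped.
    """
    mapping = {}
    for sha, message in commits:
        match = trailer.search(message)
        if match:
            mapping[int(match.group(1))] = sha
    return mapping


def old_to_new(mirror_commits, monorepo_commits):
    """Join both maps on SVN revision: mirror hash -> monorepo hash."""
    mirror = rev_to_hash(mirror_commits, GIT_SVN_ID)
    mono = rev_to_hash(monorepo_commits, LLVM_SVN)
    return {sha: mono[rev] for rev, sha in mirror.items() if rev in mono}
```

The (hash, message) pairs would be collected from `git log` in each repository, one mirror at a time. Note that this only yields a lookup table for upstream commits. It does nothing for hashes of out-of-tree commits, which have no SVN revision, and that is the part that concerns me.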
> ps. I tried the zipper layout several years ago and I concluded it was not
> useful.
> That is why, in my monorepo, I grafted some commits onto the
> corresponding commits of each individual.git.
> That only guaranteed that my monorepo isn't orphaned.
> Note that I don't think such grafts were really useful.
I'm not sure I understand what problems you found. Have you looked at
the repo with zipper layout I've prototyped at
https://github.com/bogner/llvm-zipper-prototype ?
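To make the zipper idea concrete, here is a small Python model of the commit graph it produces. It is only an illustration under assumptions: it is not the script used to build the prototype, it orders commits purely by timestamp, and the `Commit`, `Merge`, `zip_histories`, and `merge_for` names and the `mergeN` ids are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str    # existing mirror hash, left untouched
    repo: str   # e.g. "llvm" or "clang"
    time: int   # commit timestamp


@dataclass(frozen=True)
class Merge:
    sha: str        # placeholder id, not a real git hash
    parents: tuple  # (previous monorepo merge or None, mirror commit)
    heads: tuple    # sorted (repo, sha) pairs forming the combined view


def zip_histories(histories):
    """Emit one monorepo merge after each mirror commit, in time order.

    Assumes each per-repo list is oldest first; sorted() is stable, so
    commits with equal timestamps keep their per-repo order.
    """
    ordered = sorted(
        (c for commits in histories.values() for c in commits),
        key=lambda c: c.time,
    )
    merges, heads, prev = [], {}, None
    for n, commit in enumerate(ordered):
        heads[commit.repo] = commit.sha
        merge = Merge(f"merge{n}", (prev, commit.sha), tuple(sorted(heads.items())))
        merges.append(merge)
        prev = merge.sha
    return merges


def merge_for(merges, sha):
    """Find the monorepo merge whose second parent is the given mirror commit."""
    return next(m.sha for m in merges if m.parents[1] == sha)
```

In this model the list of merges is the monorepo history, and every mirror commit is reachable from exactly one merge as its second parent. In a real repository the same view should be obtainable by following first parents only (for example `git log --first-parent`) or by listing merges with `git rev-list --merges`.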
>
> Takumi
>
>
> On Thu, Nov 1, 2018 at 1:22 AM Justin Bogner via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi all,
>>
>> I've spent some time in the last couple of days trying to figure out how
>> to adopt the [LLVM git monorepo prototype] for an out of tree backend.
>> TLDR: I'm not convinced that this prototype is the right approach to
>> converting to the monorepo, and I have a possible alternative.
>>
>> The main problems I'm running into stem from the fact that this
>> prototype rewrites all of history from scratch rather than leveraging the
>> existing [official git mirrors]. This makes migrating out-of-tree work
>> from the official git mirrors to this repo very difficult, since there
>> is no shared history. Some efforts have gone into [documenting how to
>> port in-progress patches], but this doesn't attempt to discuss how to
>> handle more substantial out of tree work.
>>
>> Issues with integrating the prototype
>> -------------------------------------
>>
>> As far as I can tell, my options for trying to integrate with this
>> monorepo are fairly limited.
>>
>> If I merge my trees directly into the monorepo prototype at head, I end
>> up with two copies of every commit, one of which is a monorepo style
>> commit and one with the singular repo history. These commits are
>> completely unrelated to each other, and exist in two separate parallel
>> histories, making it difficult to correlate one to the other or even to
>> tell which is which.
>>
>> An arguably cleaner solution would be to try to recreate all of my trees'
>> history artificially as if they were based on the monorepo prototype
>> history all along, but this has two problems. First, it's a very
>> significant tooling effort to do this - I'd need to match up several
>> years of merge points to their corresponding spots in the monorepo
>> prototype and somehow redo all of the merges in the same ways. Tools
>> like "rebase --preserve-merges" don't really help here, since they abort
>> on merge conflicts and ask a human to resolve them again. Even if I were
>> to come up with tooling that managed this, I'm still left with a
>> completely new set of hashes for commits and no easy way to map them to
>> existing references in emails, bug trackers, and release notes.
>>
>> Finally, there's the option of throwing away all of my history and
>> applying my out of tree work in a single patch. This makes git-log and
>> git-blame useless for investigating issues in my codebase for a few
>> years. It also means that when fixes go into older branches they can't
>> be merged forward and need to be redone by hand.
>>
>> All of these have very significant drawbacks, and none of them really
>> sounds like a good option at all.
>>
>> An alternative approach
>> -----------------------
>>
>> All of these problems could be mitigated if we could preserve the
>> history of the existing git mirrors when generating the monorepo. There
>> are two ways to do this.
>>
>> 1. Start the monorepo by subtree-merging the various repos together at
>> an arbitrary point in time.
>>
>> 2. "Zip" together the commits in each official git mirror repo by
>> merging them into a combined view after each commit.
>>
>> While I personally don't see a problem with (1), I've heard people claim
>> that they want to use the monorepo to bisect arbitrarily far back into
>> history. If this is the case, we'd prefer an approach like (2).
>>
>> A zippered repository gives us a lot of the benefits of the prototype,
>> without a lot of the issues that are caused by rewriting history:
>>
>> - The commits from the official git mirrors exist as they are now, and
>> we don't need to deal with changing hashes.
>>
>> - Out-of-tree branches have all of their history whether they opt in to
>> creating a monorepo style history or not.
>>
>> - All of the repo's history is visible as a monorepo by looking only at
>> the merge commits. Bisect scripts can easily filter to these.
>>
>> - The monorepo commits and individual repo commits are easily
>> discernible and have a direct link between them in git's DAG, making
>> it easy to find one from the other.
>>
>> To demonstrate this approach, I've put up a snapshot of what LLVM might
>> look like if we did this, using some scripts that Duncan wrote a while
>> back to experiment with the idea:
>>
>> https://github.com/bogner/llvm-zipper-prototype
>>
>> Note that this is just a demo/prototype. It has some minor issues, isn't
>> being automatically updated, and I may regenerate it at some point.
>>
>> Thoughts?
>>
>> Thanks,
>> -- Justin Bogner
>>
>> [LLVM git monorepo prototype]: https://github.com/llvm-git-prototype/llvm
>> [official git mirrors]: https://git.llvm.org/git/llvm.git
>> [documenting how to port in-progress patches]:
>> https://reviews.llvm.org/D53414