[llvm-dev] RFC: Dealing with out of tree changes and the LLVM git monorepo

Chris Bieneman via llvm-dev llvm-dev at lists.llvm.org
Thu Nov 1 11:08:32 PDT 2018


Agreed. I would also argue that this problem isn't unique to out-of-tree backends; it can affect any fork that carries out-of-tree changes. Out-of-tree backends are probably the most common case, but a variety of other forks of LLVM projects will likely be affected too. For example, the Swift project's forks of LLVM & Clang carry out-of-tree modifications and will likely be affected.

-Chris

> On Nov 1, 2018, at 11:00 AM, paul.robinson at sony.com wrote:
> 
> While my team doesn't have one, it's clear that out-of-tree backends are an important long-standing valuable use-case for downstream consumers of LLVM, and the new monorepo should try very hard NOT to make their lives difficult.
> --paulr
>
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Chris Bieneman via llvm-dev
> Sent: Thursday, November 01, 2018 1:27 PM
> To: llvm-dev
> Subject: Re: [llvm-dev] RFC: Dealing with out of tree changes and the LLVM git monorepo
>  
> I just want to point out that the issue of incompatible history is not new. It was being discussed as far back as July 2016.
>  
> http://lists.llvm.org/pipermail/llvm-dev/2016-July/102657.html
>  
> As James said in that email:
>  
> That we'll be getting incompatible history has been glossed over, and it is
> indeed really important to make it clear and have a good plan there. This
> doesn't only affect actual "forks", it also affects every single developer
> with a local git clone which contains unfinished work.
>  
> So, what is the plan with the existing mono-repo implementation? If there isn't one, then we should strongly consider alternative implementations of the mono-repo.
>  
> I also strongly believe we should not allow a schedule to force us to ignore significant problems in the proposals and implementations. Especially ones that we've known about for years.
>  
> -Chris
> 
> 
> On Nov 1, 2018, at 6:27 AM, Alexander Richardson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>  
> On Thu, 1 Nov 2018 at 08:45, Mikael Holmén via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
> 
> Hi,
> 
> Thanks for starting this discussion Justin!
> 
> On 10/31/18 5:22 PM, Justin Bogner via llvm-dev wrote:
> 
> Hi all,
> 
> I've spent some time in the last couple of days trying to figure out how
> to adopt the [LLVM git monorepo prototype] for an out of tree backend.
> TLDR: I'm not convinced that this prototype is the right approach to
> converting to the monorepo, and I have a possible alternative.
> 
> The main problems I'm running into stem from the fact that this
> prototype rewrites all of history from scratch rather than leveraging the
> existing [official git mirrors]. This makes migrating out-of-tree work
> from the official git mirrors to this repo very difficult, since there
> is no shared history. Some efforts have gone into [documenting how to
> port in-progress patches], but this doesn't attempt to discuss how to
> handle more substantial out of tree work.
> 
> Issues with integrating the prototype
> -------------------------------------
> 
> As far as I can tell, my options for trying to integrate with this
> monorepo are fairly limited.
> 
> If I merge my trees directly into the monorepo prototype at head, I end
> up with two copies of every commit, one of which is a monorepo style
> commit and one with the singular repo history. These commits are
> completely unrelated to each other, and exist in two separate parallel
> histories, making it difficult to correlate one to the other or even to
> tell which is which.
> 
> An arguably cleaner solution would be to try to recreate all of my trees'
> history artificially as if they were based on the monorepo prototype
> history all along, but this has two problems. First, it's a very
> significant tooling effort to do this - I'd need to match up several
> years of merge points to their corresponding spots in the monorepo
> prototype and somehow redo all of the merges in the same ways. Tools
> like "rebase --preserve-merges" don't really help here, since they abort
> on merge conflicts and ask a human to resolve them again. Even if I were
> to come up with tooling that managed this, I'm still left with a
> completely new set of hashes for commits and no easy way to map them to
> existing references in emails, bug trackers, and release notes.
> 
> Finally, there's the option of throwing away all of my history and
> applying my out of tree work in a single patch. This makes git-log and
> git-blame useless for investigating issues in my codebase for a few
> years. It also means that when fixes go into older branches they can't
> be merged forward and need to be redone by hand.
> 
> All of these have very significant drawbacks, and none of them really
> sounds like a good option at all.
> 
> 
> We're in this situation. We have over 7 years of git history for our
> out-of-tree target and it would be a huge pain and drawback if we were
> to lose that history by e.g. needing to apply all our changes as a
> single patch to the new monorepo.
> 
> We haven't started moving to the monorepo yet so while we haven't hit
> the issues in practice yet, we will. Preserving the history from the git
> mirrors would surely be beneficial.
> 
> 
> We are also in the same situation for our out-of-tree CHERI backend
> (https://github.com/CTSRD-CHERI/llvm
> https://github.com/CTSRD-CHERI/clang
> https://github.com/CTSRD-CHERI/lld). I am aware there were some
> attempts at converting our repos to a monorepo structure a few years
> ago according to
> <http://lists.llvm.org/pipermail/llvm-dev/2016-July/102787.html>.
> However, I'm not sure if the script mentioned there can be reused with
> the new git monorepo and it seems that it only handles clang. We would
> have to also include our forks of llvm, lld, libunwind and libc++.
> 
> Thanks,
> Alex
> 
> 
> An alternative approach
> -----------------------
> 
> All of these problems could be mitigated if we could preserve the
> history of the existing git mirrors when generating the monorepo. There
> are two ways to do this.
> 
> 1. Start the monorepo by subtree-merging the various repos together at
>    an arbitrary point in time.
> 
> 2. "Zip" together the commits in each official git mirror repo by
>    merging them into a combined view after each commit.
> 
> While I personally don't see a problem with (1), I've heard people claim
> that they want to use the monorepo to bisect arbitrarily far back into
> history. If this is the case, we'd prefer an approach like (2).
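
One way to picture option (2) is the sketch below. It is only an illustration under simplifying assumptions, not a description of the scripts used for the prototype: each mirror is treated as a linear list of commits, the mirrors are interleaved by commit timestamp, and the names `Commit`, `ZipMerge`, and `zip_histories` are made up for this example. The point it tries to show is that the original commits are never rewritten; each one simply becomes the second parent of a new merge that records the combined state of all subprojects.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    repo: str       # illustrative subproject name, e.g. "llvm" or "clang"
    sha: str        # hash in the existing mirror; left untouched by the zip
    timestamp: int  # commit time, used here only to order the interleaving


@dataclass(frozen=True)
class ZipMerge:
    # One monorepo-style commit: first parent is the previous zip merge,
    # second parent is an unmodified commit from one of the mirrors.
    first_parent: Optional[int]         # index of the previous ZipMerge, None at the start
    second_parent: Commit
    heads: Tuple[Tuple[str, str], ...]  # (repo, sha) of every subproject in this view


def zip_histories(mirrors: Dict[str, List[Commit]]) -> List[ZipMerge]:
    """Interleave per-repo histories into a chain of merge commits."""
    # heapq.merge keeps the relative order inside each mirror and picks the
    # next commit across mirrors by timestamp (an assumption of this sketch).
    merged = heapq.merge(*mirrors.values(), key=lambda c: c.timestamp)
    heads: Dict[str, str] = {}
    result: List[ZipMerge] = []
    for commit in merged:
        heads[commit.repo] = commit.sha
        parent = len(result) - 1 if result else None
        result.append(ZipMerge(parent, commit, tuple(sorted(heads.items()))))
    return result


if __name__ == "__main__":
    demo = {
        "llvm": [Commit("llvm", "a1", 10), Commit("llvm", "a2", 30)],
        "clang": [Commit("clang", "b1", 20)],
    }
    for merge in zip_histories(demo):
        print(merge.second_parent.sha, dict(merge.heads))
```

A real conversion would have to choose the ordering more carefully (for instance by upstream revision rather than timestamp) and handle branches; the sketch ignores both.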
> 
> A zippered repository gives us a lot of the benefits of the prototype,
> without a lot of the issues that are caused by rewriting history:
> 
> - The commits from the official git mirrors exist as they are now, and
>   we don't need to deal with changing hashes.
> 
> - Out-of-tree branches have all of their history whether they opt in to
>   creating a monorepo style history or not.
> 
> - All of the repo's history is visible as a monorepo by looking only at
>   the merge commits. Bisect scripts can easily filter to these.
> 
> - The monorepo commits and individual repo commits are easily
>   discernible and have a direct link between them in git's DAG, making
>   it easy to find one from the other.
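
The last two points can be sketched on a toy commit graph. This is a minimal illustration, assuming the zipped history is a chain of merge commits linked through their first parents, with the original mirror commit as the second parent; the graph, hashes, and function names below are hypothetical. In real git the closest equivalent of the first function is probably `git log --first-parent --merges`.

```python
from typing import Dict, List, Optional

# Toy DAG: commit -> ordered list of parents (first parent first).
# "a1", "a2", "b1" stand in for mirror commits, "m1", "m2" for zip merges.
PARENTS: Dict[str, List[str]] = {
    "a1": [],
    "a2": ["a1"],
    "b1": [],
    "m1": ["a1", "b1"],
    "m2": ["m1", "a2"],
}


def monorepo_view(parents: Dict[str, List[str]], head: str) -> List[str]:
    """Follow first parents from head and keep only the merge commits."""
    view: List[str] = []
    current: Optional[str] = head
    while current is not None:
        ps = parents.get(current, [])
        if len(ps) >= 2:  # a merge commit, i.e. a monorepo-style commit
            view.append(current)
        current = ps[0] if ps else None
    return view


def mirror_commit_for(parents: Dict[str, List[str]], merge: str) -> Optional[str]:
    """Return the original mirror commit that a zip merge pulled in."""
    ps = parents[merge]
    return ps[1] if len(ps) > 1 else None


if __name__ == "__main__":
    print(monorepo_view(PARENTS, "m2"))
    print(mirror_commit_for(PARENTS, "m2"))
```

A bisect script could restrict itself to the commits returned by `monorepo_view`, while `mirror_commit_for` shows how a monorepo commit leads straight back to the unchanged hash that existing emails and bug reports refer to.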
> 
> To demonstrate this approach, I've put up a snapshot of what LLVM might
> look like if we did this, using some scripts that Duncan wrote a while
> back to experiment with the idea:
> 
>   https://github.com/bogner/llvm-zipper-prototype
> 
> I took a quick look at the zipper prototype and I think it looks awesome!
> 
> (Then unfortunately gitk flipped out and after 40 minutes it ate 42GB of
> memory (and continued grabbing more) but I don't know if that's a
> problem that is perhaps solved in a more recent git version than I'm
> running or what the problem really is.)
> 
> Thanks,
> Mikael
> 
> 
> 
> Note that this is just a demo/prototype. It has some minor issues, isn't
> being automatically updated, and I may regenerate it at some point.
> 
> Thoughts?
> 
> Thanks,
> -- Justin Bogner
> 
> [LLVM git monorepo prototype]: https://github.com/llvm-git-prototype/llvm
> [official git mirrors]: https://git.llvm.org/git/llvm.git
> [documenting how to port in-progress patches]: https://reviews.llvm.org/D53414
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev