[llvm-dev] RFC: Dealing with out of tree changes and the LLVM git monorepo

Justin Bogner via llvm-dev llvm-dev at lists.llvm.org
Wed Oct 31 10:39:50 PDT 2018


Tom Stellard <tstellar at redhat.com> writes:
> On 10/31/2018 09:22 AM, Justin Bogner via llvm-dev wrote:
>> Hi all,
>> 
>> I've spent some time in the last couple of days trying to figure out how
>> to adopt the [LLVM git monorepo prototype] for an out of tree backend.
>> TLDR: I'm not convinced that this prototype is the right approach to
>> converting to the monorepo, and I have a possible alternative.
>> 
>
> I think it's too late at this point to start considering alternative 
> monorepo layouts.  We're already behind in getting the current monorepo
> up and running, and I think discussing and implementing an alternative
> will take too long and put our goal of moving off SVN by next year's
> development meeting at risk.

The layout here is not at all different; only the process by which the
repo is generated differs. I strongly believe that a history-preserving
conversion is very important if we want to avoid making porting
out-of-tree work horribly disruptive.

> Is it possible that the monorepo you have proposed could be used as an
> aid to people trying to integrate out-of-tree branches into the
> current monorepo?
> For example, would someone be able to merge their changes into your monorepo
> and then cherry-pick them to the current monorepo?

Cherry picking out of tree branches is not at all practical. If I have a
backend that's been in development for several years and has many
merges, cherry picking doesn't help. We'd probably need a tool that
regenerates the history "as-if" it had been done on the monorepo itself,
but besides being fairly difficult to do, that has its own problems, which
I described below.
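
To make the hash problem concrete, here is a minimal sketch of how a commit
id is derived in git's SHA-1 object format: it is a hash over the tree, the
parent ids, the author/committer lines, and the message. The function name
`git_commit_id` and all of the values below are placeholders for
illustration, not real LLVM objects. Replaying the same change on top of a
rewritten history changes the tree and the parent, so the id changes too.

```python
import hashlib
from typing import List


def git_commit_id(tree: str, parents: List[str], author: str, message: str) -> str:
    """Hash a commit the way git does: SHA-1 over a header plus the body."""
    body = f"tree {tree}\n"
    body += "".join(f"parent {p}\n" for p in parents)
    body += f"author {author}\ncommitter {author}\n\n{message}\n"
    data = body.encode("utf-8")
    # The object header is "commit <byte length>" followed by a NUL byte.
    return hashlib.sha1(b"commit %d\x00" % len(data) + data).hexdigest()


# Placeholder values (hypothetical), chosen only to show the effect.
WHO = "A Dev <dev@example.com> 1540000000 +0000"
MSG = "Fix a bug in my backend"

# The change as it exists on top of the official git mirror.
mirror_id = git_commit_id("a" * 40, ["b" * 40], WHO, MSG)
# The same change replayed on a rewritten history: tree and parent differ.
rewritten_id = git_commit_id("c" * 40, ["d" * 40], WHO, MSG)

print(mirror_id != rewritten_id)  # True
```

Every descendant of a rewritten commit inherits a new parent id, so the
change propagates through the whole out-of-tree history.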

> -Tom
>
>> The main problems I'm running into stem from the fact that this
>> prototype rewrites all of history from scratch rather than leverage the
>> existing [official git mirrors]. This makes migrating out-of-tree work
>> from the official git mirrors to this repo very difficult, since there
>> is no shared history. Some efforts have gone into [documenting how to
>> port in-progress patches], but this doesn't attempt to discuss how to
>> handle more substantial out of tree work.
>> 
>> Issues with integrating the prototype
>> -------------------------------------
>> 
>> As far as I can tell, my options for trying to integrate with this
>> monorepo are fairly limited.
>> 
>> If I merge my trees directly into the monorepo prototype at head, I end
>> up with two copies of every commit, one of which is a monorepo style
>> commit and one with the singular repo history. These commits are
>> completely unrelated to each other, and exist in two separate parallel
>> histories, making it difficult to correlate one to the other or even to
>> tell which is which.
>> 
>> An arguably cleaner solution would be to try to recreate all of my trees'
>> history artificially as if they were based on the monorepo prototype
>> history all along, but this has two problems. First, it's a very
>> significant tooling effort to do this - I'd need to match up several
>> years of merge points to their corresponding spots in the monorepo
>> prototype and somehow redo all of the merges in the same ways. Tools
>> like "rebase --preserve-merges" don't really help here, since they abort
>> on merge conflicts and ask a human to resolve them again. Even if I were
>> to come up with tooling that managed this, I'm still left with a
>> completely new set of hashes for commits and no easy way to map them to
>> existing references in emails, bug trackers, and release notes.
>> 
>> Finally, there's the option of throwing away all of my history and
>> applying my out of tree work in a single patch. This makes git-log and
>> git-blame useless for investigating issues in my codebase for a few
>> years. It also means that when fixes go into older branches they can't
>> be merged forward and need to be redone by hand.
>> 
>> All of these have very significant drawbacks, and none of them really
>> sounds like a good option at all.
>> 
>> An alternative approach
>> -----------------------
>> 
>> All of these problems could be mitigated if we could preserve the
>> history of the existing git mirrors when generating the monorepo. There
>> are two ways to do this.
>> 
>> 1. Start the monorepo by subtree-merging the various repos together at
>>    an arbitrary point in time.
>> 
>> 2. "Zip" together the commits in each official git mirror repo by
>>    merging them into a combined view after each commit.
>> 
>> While I personally don't see a problem with (1), I've heard people claim
>> that they want to use the monorepo to bisect arbitrarily far back into
>> history. If this is the case, we'd prefer an approach like (2).
>> 
>> A zippered repository gives us a lot of the benefits of the prototype,
>> without a lot of the issues that are caused by rewriting history:
>> 
>> - The commits from the official git mirrors exist as they are now, and
>>   we don't need to deal with changing hashes.
>> 
>> - Out-of-tree branches have all of their history whether they opt in to
>>   creating a monorepo style history or not.
>> 
>> - All of the repo's history is visible as a monorepo by looking only at
>>   the merge commits. Bisect scripts can easily filter to these.
>> 
>> - The monorepo commits and individual repo commits are easily
>>   discernible and have a direct link between them in git's DAG, making
>>   it easy to find one from the other.
>> 
>> To demonstrate this approach, I've put up a snapshot of what LLVM might
>> look like if we did this, using some scripts that Duncan wrote a while
>> back to experiment with the idea:
>> 
>>   https://github.com/bogner/llvm-zipper-prototype
>> 
>> Note that this is just a demo/prototype. It has some minor issues, isn't
>> being automatically updated, and I may regenerate it at some point.
>> 
>> Thoughts?
>> 
>> Thanks,
>> -- Justin Bogner
>> 
>> [LLVM git monorepo prototype]: https://github.com/llvm-git-prototype/llvm
>> [official git mirrors]: https://git.llvm.org/git/llvm.git
>> [documenting how to port in-progress patches]: https://reviews.llvm.org/D53414
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> 
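As a rough illustration of the "zip" approach in option (2) above, the
sketch below models it on a toy commit graph in plain Python, without
touching git. It is not the scripts Duncan wrote; the names `Commit`,
`zip_repos`, and `monorepo_view`, the timestamps, and the commit ids are all
made up for the example. It assumes each mirror has a linear history and
that each combined-view commit takes the previous combined-view commit as
its first parent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    """A toy commit: an id, the ids of its parents, and a kind tag."""
    id: str
    parents: Tuple[str, ...]
    kind: str  # "repo" for an original mirror commit, "zip" for a merge


def zip_repos(repos: Dict[str, List[Tuple[int, str]]]) -> List[Commit]:
    """Interleave per-repo commits by timestamp, adding one merge per commit.

    `repos` maps a repo name to its linear history as (timestamp, commit id)
    pairs, oldest first. Original commits keep their ids and parents; each
    merge has the previous merge as first parent and the new commit second.
    """
    events = sorted(
        (ts, name, cid) for name, history in repos.items() for ts, cid in history
    )
    dag: List[Commit] = []
    tip: Dict[str, str] = {}        # latest original commit seen per repo
    prev_zip: Optional[str] = None  # latest monorepo-style merge commit
    for n, (_ts, name, cid) in enumerate(events, start=1):
        # The original commit is recorded exactly as it is in its mirror.
        parents = (tip[name],) if name in tip else ()
        dag.append(Commit(cid, parents, "repo"))
        tip[name] = cid
        # The very first combined commit has nothing to merge with yet.
        zip_parents = (prev_zip, cid) if prev_zip else (cid,)
        prev_zip = f"zip{n}"
        dag.append(Commit(prev_zip, zip_parents, "zip"))
    return dag


def monorepo_view(dag: List[Commit]) -> List[str]:
    """Return only the combined-view commits, as a bisect script would."""
    return [c.id for c in dag if c.kind == "zip"]


dag = zip_repos({"llvm": [(1, "l1"), (3, "l2")], "clang": [(2, "c1")]})
print(monorepo_view(dag))  # ['zip1', 'zip2', 'zip3']
```

The original commits (`l1`, `l2`, `c1`) keep their ids and parent links,
which is what lets out-of-tree branches based on the mirrors keep working,
while the `zip` commits form the monorepo-style history.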

