[llvm-dev] RFC: Dealing with out of tree changes and the LLVM git monorepo

Tom Stellard via llvm-dev llvm-dev at lists.llvm.org
Wed Oct 31 11:27:10 PDT 2018


On 10/31/2018 10:39 AM, Justin Bogner wrote:
> Tom Stellard <tstellar at redhat.com> writes:
>> On 10/31/2018 09:22 AM, Justin Bogner via llvm-dev wrote:
>>> Hi all,
>>>
>>> I've spent some time in the last couple of days trying to figure out how
>>> to adopt the [LLVM git monorepo prototype] for an out of tree backend.
>>> TLDR: I'm not convinced that this prototype is the right approach to
>>> converting to the monorepo, and I have a possible alternative.
>>>
>>
>> I think it's too late at this point to start considering alternative 
>> monorepo layouts.  We're already behind in getting the current monorepo
>> up and running, and I think discussing and implementing an alternative
>> will take too long and put our goal of moving off SVN by next year's
>> development meeting at risk.
> 
> The layout here is not at all different, only the process by which the
> repo is generated. I strongly believe that a history preserving
> conversion is very important if we want to avoid making porting
> out-of-tree work horribly disruptive.
> 

The process is actually what I'm concerned about here, much more so than
the physical layout of the repo.  It takes time to discuss, develop
and debug a new process for automatically syncing from SVN to a new git
repository.  We've already gone through all these steps with the existing
monorepo, so switching to something else at this point would be a step
backwards in my opinion.

-Tom

>> Is it possible that the monorepo you have proposed could be used as an
>> aid to people trying to integrate out-of-tree branches into the
>> current monorepo?
>> For example, would someone be able to merge their changes into your monorepo
>> and then cherry-pick them to the current monorepo?
> 
> Cherry picking out of tree branches is not at all practical. If I have a
> backend that's been in development for several years and has many
> merges, cherry picking doesn't help. We'd probably need a tool that
> regenerates the history "as-if" it had been done on the monorepo itself,
> but besides being fairly difficult to do, that has its own problems that
> I described below.
> 
>> -Tom
>>
>>> The main problems I'm running into stem from the fact that this
>>> prototype rewrites all of history from scratch rather than leverage the
>>> existing [official git mirrors]. This makes migrating out-of-tree work
>>> from the official git mirrors to this repo very difficult, since there
>>> is no shared history. Some efforts have gone into [documenting how to
>>> port in-progress patches], but this doesn't attempt to discuss how to
>>> handle more substantial out of tree work.
>>>
>>> Issues with integrating the prototype
>>> -------------------------------------
>>>
>>> As far as I can tell, my options for trying to integrate with this
>>> monorepo are fairly limited.
>>>
>>> If I merge my trees directly into the monorepo prototype at head, I end
>>> up with two copies of every commit, one of which is a monorepo style
>>> commit and one with the singular repo history. These commits are
>>> completely unrelated to each other, and exist in two separate parallel
>>> histories, making it difficult to correlate one to the other or even to
>>> tell which is which.
>>>
>>> An arguably cleaner solution would be to try to recreate all of my trees'
>>> history artificially as if they were based on the monorepo prototype
>>> history all along, but this has two problems. First, it's a very
>>> significant tooling effort to do this - I'd need to match up several
>>> years of merge points to their corresponding spots in the monorepo
>>> prototype and somehow redo all of the merges in the same ways. Tools
>>> like "rebase --preserve-merges" don't really help here, since they abort
>>> on merge conflicts and ask a human to resolve them again. Even if I were
>>> to come up with tooling that managed this, I'm still left with a
>>> completely new set of hashes for commits and no easy way to map them to
>>> existing references in emails, bug trackers, and release notes.
>>>
>>> Finally, there's the option of throwing away all of my history and
>>> applying my out of tree work in a single patch. This makes git-log and
>>> git-blame useless for investigating issues in my codebase for a few
>>> years. It also means that when fixes go into older branches they can't
>>> be merged forward and need to be redone by hand.
>>>
>>> All of these have very significant drawbacks, and none of them really
>>> sounds like a good option at all.
>>>
>>> An alternative approach
>>> -----------------------
>>>
>>> All of these problems could be mitigated if we could preserve the
>>> history of the existing git mirrors when generating the monorepo. There
>>> are two ways to do this.
>>>
>>> 1. Start the monorepo by subtree-merging the various repos together at
>>>    an arbitrary point in time.
>>>
>>> 2. "Zip" together the commits in each official git mirror repo by
>>>    merging them into a combined view after each commit.
>>>
>>> While I personally don't see a problem with (1), I've heard people claim
>>> that they want to use the monorepo to bisect arbitrarily far back into
>>> history. If this is the case, we'd prefer an approach like (2).
>>>
>>> A zippered repository gives us a lot of the benefits of the prototype,
>>> without a lot of the issues that are caused by rewriting history:
>>>
>>> - The commits from the official git mirrors exist as they are now, and
>>>   we don't need to deal with changing hashes.
>>>
>>> - Out-of-tree branches have all of their history whether they opt in to
>>>   creating a monorepo style history or not.
>>>
>>> - All of the repo's history is visible as a monorepo by looking only at
>>>   the merge commits. Bisect scripts can easily filter to these.
>>>
>>> - The monorepo commits and individual repo commits are easily
>>>   discernible and have a direct link between them in git's DAG, making
>>>   it easy to find one from the other.
>>>
>>> To demonstrate this approach, I've put up a snapshot of what LLVM might
>>> look like if we did this, using some scripts that Duncan wrote a while
>>> back to experiment with the idea:
>>>
>>>   https://github.com/bogner/llvm-zipper-prototype
>>>
>>> Note that this is just a demo/prototype. It has some minor issues, isn't
>>> being automatically updated, and I may regenerate it at some point.
>>>
>>> Thoughts?
>>>
>>> Thanks,
>>> -- Justin Bogner
>>>
>>> [LLVM git monorepo prototype]: https://github.com/llvm-git-prototype/llvm
>>> [official git mirrors]: https://git.llvm.org/git/llvm.git
>>> [documenting how to port in-progress patches]: https://reviews.llvm.org/D53414
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
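
For readers following the thread, the "zipper" idea in option (2) of the
quoted proposal can be illustrated with a small model. This is only a
sketch under assumptions, not the scripts behind the linked prototype:
the names Commit, ZipMerge, zip_histories and first_bad_merge are
hypothetical, commits are assumed to be ordered by timestamp within each
mirror, and no real git objects are created. It shows that the mirror
commit ids are left untouched, that each zipper merge links back to the
mirror commit it absorbed, and that a bisect can walk the merges alone.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Commit:
    """A commit in one per-project mirror (hypothetical model, not a git object)."""
    repo: str
    commit_id: str
    timestamp: int  # assumed non-decreasing within each mirror's history


@dataclass(frozen=True)
class ZipMerge:
    """A monorepo-style merge created after one mirror commit is absorbed."""
    merge_id: str
    parents: Tuple[str, ...]            # (previous zip merge, mirror commit)
    heads: Tuple[Tuple[str, str], ...]  # snapshot of (repo, mirror commit id)


def zip_histories(mirrors: Dict[str, List[Commit]]) -> List[ZipMerge]:
    """Interleave the mirrors by time and emit one merge per mirror commit."""
    # heapq.merge lazily interleaves inputs that are already sorted by the key.
    ordered = heapq.merge(*mirrors.values(), key=lambda c: (c.timestamp, c.repo))
    heads: Dict[str, str] = {}
    merges: List[ZipMerge] = []
    for n, commit in enumerate(ordered):
        heads[commit.repo] = commit.commit_id  # mirror id is reused unchanged
        if merges:
            parents = (merges[-1].merge_id, commit.commit_id)
        else:
            parents = (commit.commit_id,)  # first merge has no predecessor
        merges.append(ZipMerge(f"zip{n}", parents, tuple(sorted(heads.items()))))
    return merges


def first_bad_merge(merges: List[ZipMerge],
                    is_bad: Callable[[ZipMerge], bool]) -> ZipMerge:
    """Binary search over the merges only, as a bisect script might.

    Assumes the last merge is bad and that once a merge is bad, every
    later merge is bad as well.
    """
    lo, hi = 0, len(merges) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(merges[mid]):
            hi = mid
        else:
            lo = mid + 1
    return merges[lo]


if __name__ == "__main__":
    mirrors = {
        "llvm": [Commit("llvm", "a1", 10), Commit("llvm", "a2", 30)],
        "clang": [Commit("clang", "b1", 20)],
    }
    for merge in zip_histories(mirrors):
        print(merge.merge_id, merge.parents, dict(merge.heads))
```

In this model an out-of-tree branch based on mirror commit "a1" stays
valid, because "a1" still exists under the same id and appears as a
parent of a zipper merge.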


