[llvm-dev] RFC: Dealing with out of tree changes and the LLVM git monorepo

James Y Knight via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 2 09:58:50 PDT 2018


Thanks for writing this up. I think it's a really important point which
deserves discussion.

Ultimately, I think it is a question as to whether to prioritize the easy
switchover for existing out of tree forks, or to prioritize having the best
conversion we can make. I feel very strongly that the latter should be the
priority for the official repository conversion, and that, therefore, we
should not use the zipper method for the official repository going forward.

However, it's also worth putting much thought into making switchover as
easy as possible within the confines of what's possible given that
prioritization.

On Wed, Oct 31, 2018 at 12:22 PM Justin Bogner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> An arguably cleaner solution would be try to recreate all of my trees'
> history artificially as if they were based on the monorepo prototype
> history all along, but this has two problems. First, it's a very
> significant tooling effort to do this - I'd need to match up several
> years of merge points to their corresponding spots in the monorepo
> prototype and somehow redo all of the merges in the same ways. Tools
> like "rebase --preserve-merges" don't really help here, since they abort
> on merge conflicts and ask a human to resolve them again.


I realized I had already written most of the functionality such a
conversion tool needs, so I've written a tool which is able to (mostly!)
convert a single-project repository (with all its commits and merges) into
a monorepo repository (with the same commits and merges). The transform is
conceptually trivial -- for each commit, take the subproject's tree from the
old commit, and take the rest of the content from the monorepo parent.
That works perfectly, with no need for any conflict resolution, UNLESS
there are potential merge conflicts in the parts of the tree OUTSIDE the
original repository's subproject.
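
To make the transform concrete, here is a minimal sketch of the per-commit
step. It is not the tool itself: trees are modelled as plain dicts mapping
path to content, and the function name and the example paths are made up for
illustration.

```python
def to_monorepo_tree(subproject, old_tree, monorepo_parent_tree):
    """Combine a single-project tree with the rest of a monorepo tree.

    Both trees are plain dicts (path -> content). This is a toy model; a
    real tool would operate on git tree objects.
    """
    prefix = subproject + "/"
    # Keep everything from the monorepo parent except the subproject dir.
    result = {path: blob for path, blob in monorepo_parent_tree.items()
              if not path.startswith(prefix)}
    # Graft the old repository's tree under the subproject directory.
    for path, blob in old_tree.items():
        result[prefix + path] = blob
    return result


# Hypothetical example data.
monorepo_parent = {
    "llvm/lib/IR/Value.cpp": "llvm upstream",
    "clang/lib/Sema/Sema.cpp": "clang upstream",
}
fork_commit = {"lib/Sema/Sema.cpp": "clang with local changes"}

converted = to_monorepo_tree("clang", fork_commit, monorepo_parent)
```

The result keeps "llvm/" exactly as in the monorepo parent and takes
"clang/" entirely from the fork's commit, which is why no conflict
resolution is needed in the common case.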

As the original repository will -- by definition -- not touch the other
directories, such a conflict can only happen if you have merges between
upstream-svn release branches in your history. For example, suppose that in
your fork of clang.git you started working from the release_50 branch and
then (potentially after a bunch of work) merged the release_60 branch. In
your clang fork, you of course had to resolve any conflicts in clang, but
you would NOT have resolved conflicts between release_50 and release_60 in
"llvm" or other subprojects. The tool can't necessarily know what to do
here either.

Now, in that case, it's pretty likely that you'd want to just take the
release_60 tree as is, throwing out the changes that happened only on the
release_50 branch. So, if this seems useful, I can imagine adding a
heuristic or a manual override to support that particular case.
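
As a rough sketch of what such an override could look like (an assumption
about one possible design, not something the tool does today): compare the
content outside the subproject in the merge base and in both parents, and
only ask for a decision when both sides changed it differently. The
comparison here is deliberately coarse (whole rest-of-tree, not per file),
and all names and data are hypothetical.

```python
def outside_subproject(tree, subproject):
    """Return the part of a tree (dict path -> content) outside subproject/."""
    prefix = subproject + "/"
    return {p: b for p, b in tree.items() if not p.startswith(prefix)}


def rest_of_tree_for_merge(subproject, base, ours, theirs, prefer=None):
    """Pick the non-subproject content for a converted merge commit.

    prefer is the manual override: "ours" or "theirs". It is only consulted
    when both parents changed the rest of the tree in different ways.
    """
    base_r, ours_r, theirs_r = (outside_subproject(t, subproject)
                                for t in (base, ours, theirs))
    if ours_r == theirs_r or theirs_r == base_r:
        return ours_r            # nothing new on the other side
    if ours_r == base_r:
        return theirs_r          # only the merged-in side changed
    if prefer == "ours":
        return ours_r
    if prefer == "theirs":
        return theirs_r
    raise ValueError("both parents changed content outside %s/; "
                     "an override is needed" % subproject)


# Hypothetical trees: a fork started on release_50 merges release_60.
base = {"llvm/a.cpp": "trunk", "clang/x.cpp": "trunk"}
release_50 = {"llvm/a.cpp": "5.0 fix", "clang/x.cpp": "fork work"}
release_60 = {"llvm/a.cpp": "6.0", "clang/x.cpp": "6.0"}

rest = rest_of_tree_for_merge("clang", base, release_50, release_60,
                              prefer="theirs")
```

With prefer="theirs" the converted merge takes the release_60 content for
everything outside clang, which matches the "just take the release_60 tree
as is" case described above.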

I'll post the tool soon. It could also be extended to support conversions
from the previous monorepo repositories to make that easier for folks too.

> Even if I were
> to come up with tooling that managed this, I'm still left with a
> completely new set of hashes for commits and no easy way to map them to
> existing references in emails, bug trackers, and release notes


*Creating* such a commit mapping is certainly easy.
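
For illustration only, a mapping like that could be as simple as a table of
(old hash, new hash) pairs recorded while converting, plus a lookup that
accepts the abbreviated hashes people tend to paste into emails and bug
trackers. The function names and the hashes below are made up.

```python
def build_commit_map(converted_pairs):
    """Build {old_hash: new_hash} from pairs recorded during conversion."""
    return {old: new for old, new in converted_pairs}


def lookup(commit_map, old_prefix):
    """Find the new hash for an old hash given in full or abbreviated form."""
    matches = [new for old, new in commit_map.items()
               if old.startswith(old_prefix)]
    if len(matches) != 1:
        raise KeyError("no unique match for %r" % old_prefix)
    return matches[0]


# Placeholder hashes, not real commits.
pairs = [
    ("aaaa1111" + "0" * 32, "bbbb2222" + "0" * 32),
    ("cccc3333" + "0" * 32, "dddd4444" + "0" * 32),
]
commit_map = build_commit_map(pairs)
new_hash = lookup(commit_map, "aaaa1111")
```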

[....]


One more option -- which I've not yet tried, but seems like it could be
really promising -- would be to have _your_ repository's history have a
different shape from everyone else's, but still keep the same commit hashes
at head, going forward. Of course "That's impossible!" -- editing the
history will necessarily change the hash! But, actually, you can pull this
off using "replace" refs (see "git replace").

Start with the git merge you already created (merging all your
split-repositories into one branch on top of a monorepo-prototype commit).
Then, "git replace" the monorepo-prototype commit that you merged in with a
commit that has the same content, but from your "zippered" repository
history. That won't change the hash (thus, future merging will work
properly), but it effectively changes the history to be the way you'd like
to see it.

Thus you'll see the zipped history up until that point, avoiding seeing
multiple copies of svn commits in your history, and you can use the new
monorepo commits going forward.
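
To show why the hashes stay the same while the visible history changes,
here is a toy model of how a history walk treats a replace ref. It does not
run git; commits are single-letter names in a dict, and the graph is
invented for the example. "M" is your merge, "P" the monorepo-prototype
commit you merged in, and "Z" a commit with the same content as "P" but
sitting on top of your zippered history.

```python
def history(head, parents, replace=None):
    """List the commits reachable from head, honouring replacements.

    parents: dict commit -> list of parent commits.
    replace: dict original -> replacement. The original keeps its name, but
    its parents (and content) are taken from the replacement, which is how
    this sketch models a replace ref.
    """
    replace = replace or {}
    seen, order, stack = set(), [], [head]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        effective = replace.get(commit, commit)
        stack.extend(parents.get(effective, []))
    return order


# Hypothetical commit graph.
parents = {
    "M": ["F", "P"],   # your merge: fork work + monorepo-prototype commit
    "F": ["Z1"],       # fork work on top of the zippered history
    "P": ["P1"],       # monorepo-prototype history
    "P1": [],
    "Z": ["Z1"],       # same content as P, zippered ancestry
    "Z1": [],
}

normal = history("M", parents)
zipped = history("M", parents, replace={"P": "Z"})
```

In both walks the head is still "M" and the merged-in commit is still named
"P", so future merges line up with everyone else's repository. Only the
ancestry behind "P" differs: "P1" shows up in the normal walk, while the
walk with the replacement reaches the zippered history instead.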

(One note -- users would need to fetch the replace refs after cloning a new
repo (e.g., with `git fetch origin refs/replace/*:refs/replace/*`), since
clone won't fetch them automatically. If they forget to, they'll simply see
the "normal" history, rather than the zipper history.)