[llvm-dev] RFC: Dealing with out of tree changes and the LLVM git monorepo

James Y Knight via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 5 23:14:59 PST 2018


As promised, here's a tool for migrating downstream forks of the split
subproject repositories (e.g. https://github.com/llvm-mirror/**) into a
monorepo. It could be extended to handle conversion of a previous monorepo
into the current one, too, but it doesn't currently.

(BTW -- just a reminder that the "llvm-git-prototype" repositories are NOT
FINAL yet and may still change. Until they are finalized, please don't base
repositories on them, other than for demonstration/testing purposes.)

To use this tool, you must first pull together a git repository which
includes the upstream monorepo, the relevant upstream split repositories,
and any number of your private split repositories. Then, run the script to
rewrite all the specified branches (destructively, in-place!), as if you'd
made the commits on top of the monorepo.

Note -- this does not _interleave_ commits made to different subproject
forks/branches. E.g., if you start with a branch of the "llvm" subproject
repo and a branch of the "clang" subproject repo, you'll still have two
branches after running this. But both will now be based on a single
upstream repository: one will have your changes to the llvm/ subdir, and
the other your changes to the clang/ subdir. (You may merge them afterwards
if you wish.)
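
One way to sanity-check this after a migration is to list the top-level
directories a migrated branch changes relative to its merge base with the
upstream monorepo. This is only a sketch: changed_top_dirs is a hypothetical
helper of my own naming, not part of migrate-downstream-fork.py.

```sh
# Hypothetical helper (not part of the migration tool): print the top-level
# directories that BRANCH changes relative to its merge base with UPSTREAM.
changed_top_dirs() {
  upstream=$1
  branch=$2
  # Fails if the two refs share no history at all.
  base=$(git merge-base "$upstream" "$branch") || return 1
  # Keep only the first path component of every changed file.
  git diff --name-only "$base" "$branch" | cut -d/ -f1 | sort -u
}
```

For example, "changed_top_dirs refs/remotes/monorepo/master
refs/remotes/cheri/clang/master" should print only "clang" if the migrated
clang fork behaves as described above.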

A known issue is that the script will give up if given a commit graph that
merges from multiple upstream svn branches. Such a merge could introduce
conflicts in subprojects other than the one you've forked, and those
conflicts were never resolved in your merge commit. I'm not sure if anyone
will run into that, but if they do, some heuristics can likely be added to
handle it (e.g., throw out the old release branch's changes and keep just
the new).

The tool is available here:
https://github.com/jyknight/llvm-git-migration/blob/master/migrate-downstream-fork.py

I've tested it on the CHERI project, and it took about a minute to run
'migrate-downstream-fork' itself. (Pulling all the various source
repositories took longer, of course.)

The result is here, for now (again, only for demonstration/testing!):
https://github.com/jyknight/CHERI-monorepo-prototype

I've uploaded only a single branch there, "cheri", made by running git merge
on three of CHERI's subproject branches (llvm, clang, lld) after the
migration. The other subprojects that CHERI forked have not been brought up
to a consistent revision in a long time, so they can't be merged into a
consistent target branch. I didn't look into any of the other branches in
their repositories.

But all branches of all of them did get migrated by the tool and could be
uploaded (either as-is, to their own branches, or after making appropriate
merges between the subproject branches, whichever is preferable).
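
To see which refs were actually rebased onto the monorepo, something like the
following could be used. It is a sketch under one assumption: that an
unmigrated split-repo ref shares no commits with the monorepo, so git
merge-base fails for it. refs_based_on is a hypothetical helper name, not
something the tool provides.

```sh
# Hypothetical helper: for every ref under PREFIX, report whether it shares
# any history with UPSTREAM. git merge-base exits non-zero when two commits
# have no common ancestor.
refs_based_on() {
  upstream=$1
  prefix=$2
  git for-each-ref --format='%(refname)' "$prefix" |
  while read -r ref; do
    if git merge-base "$upstream" "$ref" >/dev/null 2>&1; then
      echo "based on upstream: $ref"
    else
      echo "NOT based on upstream: $ref"
    fi
  done
}
```

With the remotes set up as in the script below, this would be called as
"refs_based_on refs/remotes/monorepo/master refs/remotes/cheri".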

Here's what I ran:

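#!/bin/bash
# bash rather than sh: the final git merge line relies on brace expansion.
set -exu

mkdir cheri-monorepo
cd cheri-monorepo
git init

# Upstream monorepo prototype.
git remote add monorepo https://github.com/llvm-git-prototype/llvm.git

# Upstream split mirrors and the CHERI fork of each subproject.
for x in llvm lld clang lldb libunwind libcxx; do
  git remote add split/$x https://github.com/llvm-mirror/$x
  git remote add cheri/$x https://github.com/CTSRD-CHERI/$x
done

git fetch --all

# Rewrite the specified refs in place (destructive).
../llvm-git-migration/migrate-downstream-fork.py refs/remotes/cheri refs/tags

# Start a branch at the monorepo commit the migrated llvm fork is based on,
# then merge the three migrated subproject branches into it.
git checkout -b cheri $(git merge-base refs/remotes/monorepo/master refs/remotes/cheri/llvm/master)
git merge refs/remotes/cheri/{llvm,clang,lld}/master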

On Mon, Nov 5, 2018 at 1:53 PM David Greene <dag at cray.com> wrote:

> Well shoot, you beat me to it.  :)
>
> I've been working on a similar tool but it's not ready yet.  Looking
> forward to trying yours!
>
>                             -David
>
> James Y Knight via llvm-dev <llvm-dev at lists.llvm.org> writes:
>
> > I'm about to post exactly this tool -- I've been testing it on the
> > CHERI forks of llvm/clang/lld (lots of history and merges and stuff
> > there, makes a pretty nice test case!)
> >
> > On Mon, Nov 5, 2018 at 1:07 PM David Greene via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >
> >     Mehdi AMINI <joker.eph at gmail.com> writes:
> >
> >     > Yes, but that's the case for the zipper repo anyway: one merge per
> >     > commit. The point is that the second commit is just a trivial merge,
> >     > it wouldn't show up in a file `git log` for example.
> >     > In the linear rewritten monorepo, adding the history taken from the
> >     > existing git mirror would lead to duplicated commits, as in
> >     > *identical* commit / commit with the same diff but different git
> >     > hashes. I'd expect git log to show us the two commits in the git log
> >     > of a single file.
> >
> >     Would it be valuable to have a tool to take branches from existing git
> >     mirrors and rewrite them in terms of the monorepo so there would be no
> >     duplicate commits and everything would appear to have been done against
> >     the monorepo?
> >
> >     I know Justin is worried about hashes in old e-mails being invalid, but
> >     the tool could include the mapping from old hash to new hash in the
> >     commit message. Of course that would only be done for local downstream
> >     commits, as the monorepo commits were already rewritten without
> >     including that information. Would it be helpful to have the monorepo
> >     commits contain that information?
> >
> >     -David
> >     _______________________________________________
> >     LLVM Developers mailing list
> >     llvm-dev at lists.llvm.org
> >     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>