[llvm-dev] [monorepo] Downstream branch zipping tool available

David Greene via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 12 13:26:59 PST 2018

Building on the great work that James Knight did on
migrate-downstream-fork.py (Thanks, James!) [1], I've created a simple
tool to take migrated downstream fork branches and zip them into a
single history, guided by a history containing submodule updates of
the subprojects [2].

With migrate-downstream-fork.py, one is left with a set of unrelated
histories, one per subproject:

llvm                             clang                       compiler-rt
* V Add my fancy LLVM feature    * G Fix my dumb clang bug   * Z Merge from upstream compiler-rt

One can do an octopus merge to unify them:

  *-- Merge llvm, clang and compiler-rt
  |\ \
  * \ \  V Add my fancy LLVM feature
  |  * |  G Fix my dumb clang bug
  |  | *  Z Merge from upstream compiler-rt

Unfortunately, that doesn't show the logical history of development,
where changes were effectively applied to subprojects in a linear
fashion.  This makes it more difficult to do bisects, among other things,
because none of the downstream integration happens until the octopus
merge.

Let's say that downstream you have a local mirror for each LLVM
subproject you work on.  Suppose also that you have an "umbrella"
repository that holds submodule references to all those local mirrors.
Various commits in the umbrella update submodule references:

  * Update llvm submodule to V
  * Update clang submodule to G
  * Don't update any submodules, fix scripts or something
  * Update compiler-rt submodule to Z

zip-downstream-fork.py will take these submodule updates and "inline"
them into the umbrella history, making it appear that the downstream
commits were applied against the monorepo in the order implied by the
umbrella history:

  * A Add my fancy LLVM feature
  * B Fix my dumb clang bug
  * C Merge from upstream compiler-rt

Parent relationships for merges from upstream are preserved, though as
top-level comments in zip-downstream-fork.py explain, the history graph
can look a little strange.  Commits that don't update submodules are
skipped on the assumption that they modify things uninteresting to a
monorepo history.  Such commits could be preserved, but doing so has some
caveats, as explained in the comments.  For example, if your umbrella
repository holds your build scripts, you'd probably want to migrate them
to the zipped history.  If there's strong demand for this I could look
into doing it.
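
To make the "inlining" idea concrete, here is a minimal Python sketch of
the ordering logic only.  It is not how zip-downstream-fork.py is
implemented: it does not touch git at all, the names UmbrellaCommit,
ZippedCommit and zip_history are hypothetical, and it assumes that each
umbrella commit updates at most one submodule, that each update
corresponds to exactly one downstream commit, and that the umbrella
commits are listed oldest first.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class UmbrellaCommit:
    """One commit in a (hypothetical) umbrella history."""
    message: str
    # (subproject, downstream commit id) if this commit bumps a submodule,
    # None if it only touches other files such as scripts.
    submodule_update: Optional[Tuple[str, str]] = None


@dataclass
class ZippedCommit:
    """One commit in the resulting linear, monorepo-style history."""
    message: str
    # Which downstream commit each subproject directory is at after this commit.
    subproject_state: Dict[str, str]


def zip_history(
    umbrella: List[UmbrellaCommit],
    downstream_messages: Dict[Tuple[str, str], str],
) -> List[ZippedCommit]:
    """Inline submodule updates into one linear history (oldest first)."""
    state: Dict[str, str] = {}
    zipped: List[ZippedCommit] = []
    for commit in umbrella:
        if commit.submodule_update is None:
            # Assumption taken from the description above: commits that
            # do not update a submodule are skipped.
            continue
        subproject, commit_id = commit.submodule_update
        state[subproject] = commit_id
        # Reuse the downstream commit's own message, and record a snapshot
        # of every subproject as of this point in the umbrella history.
        message = downstream_messages[(subproject, commit_id)]
        zipped.append(ZippedCommit(message, dict(state)))
    return zipped


umbrella = [
    UmbrellaCommit("Update llvm submodule to V", ("llvm", "V")),
    UmbrellaCommit("Update clang submodule to G", ("clang", "G")),
    UmbrellaCommit("Don't update any submodules, fix scripts or something"),
    UmbrellaCommit("Update compiler-rt submodule to Z", ("compiler-rt", "Z")),
]
downstream_messages = {
    ("llvm", "V"): "Add my fancy LLVM feature",
    ("clang", "G"): "Fix my dumb clang bug",
    ("compiler-rt", "Z"): "Merge from upstream compiler-rt",
}

for zipped_commit in zip_history(umbrella, downstream_messages):
    print(zipped_commit.message, zipped_commit.subproject_state)
```

The point of the snapshot in each ZippedCommit is that every commit in
the zipped history describes a complete state of all subprojects, which
is what makes bisecting across subprojects meaningful.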

There are various other limitations to the tool explained in the
comments.  It was enough to get us going and I'm hopeful it will be
useful for others.  It seems to do the right thing with our repositories
but YMMV.  Feel free to open PRs with bug fixes.  :)

To get this to work, you'll need to apply a PR for
migrate-downstream-fork.py to fix issues with --revmap-out [3].


[1] https://github.com/jyknight/llvm-git-migration/blob/master/migrate-downstream-fork.py
[2] https://github.com/jyknight/llvm-git-migration/pull/2/commits/a3b44a294c20f1762cb42b5794e6130c5b27f22d
[3] https://github.com/jyknight/llvm-git-migration/pull/1
