[llvm-dev] [monorepo] Downstream branch zipping tool available

David Greene via llvm-dev llvm-dev at lists.llvm.org
Tue Dec 18 07:45:19 PST 2018


Björn Pettersson A <bjorn.a.pettersson at ericsson.com> writes:

> We have used llvm as the umbrella repo, so in llvm we have a "master"
> branch (from the git single repo version of llvm) and a couple of
> downstream branches (let's call them "down0", "down1") containing our
> downstream work (with frequent merges from "master").

Ok.

> The downstream branches has tools/clang and runtimes/compiler-rt as
> submodules, as well as a couple of downstream submodules.

Ok.

> In our downstream version of clang we have a similar structure.
> A "master" branch (mapping to the git single repo version clang),
> and a couple of downstream branches. The downstream branches has
> tools/extra (i.e. clang-tools-extra) as a submodule.

So the clang submodule in llvm has a submodule itself?  I wasn't even
aware that was possible.

> I can also mention that the clang, compiler-rt and clang-tools-extra
> submodules aren't present from the beginning of history. They have
> been added later on.

That shouldn't be a problem for the script.  We have the same sort of
history.

> I doubt that zip-downstream-fork.py will work out-of-the-box.
> Hopefully I'll be able to patch it for our scenario. Any guidelines
> might be helpful. But maybe it isn't even worth trying to adapt
> zip-downstream-fork.py to do something useful for our scenario?

Yeah, non-submodule-update commits in the llvm repository would be
dropped per this comment:

# - The script assumes that any commits in the umbrella history that
#   do not update submodules should be discarded.  It is not clear
#   what should happen if such a commit happens to touch files with
#   the same name as those in the monorepo (README files are typical).
#   Adding support to keep these commits should be straightforward,
#   but because decisions are likely to vary based on particular
#   setups, we just punt for now.

This happens around line 288 in zip-downstream-fork.py:

    if self.prev_submodules == submodules:
      # This is a commit that modified some file in the umbrella and
      # didn't update any submodules..  Assume we don't want it.
      self.debug('No submodule updates')
      return self.substitute_commit(commit, githash)

If you return commit here instead of doing substitute_commit it should
retain the commit unaltered.  That's not quite what you want for the
monorepo, though: you want commits to llvm to appear under the llvm
directory in the monorepo.  The code to do that is in
migrate-downstream-fork.py around line 106 in commit_filter:

    # OK -- NOT an upstream commit: move the tree under the correct subdir, and
    # preserve everything outside that subdir.  The tricky part is figuring out
    # *which* parent to get the rest of the tree (other than the named subproject)
    # from, in case of a merge.

You could try to copy this verbatim into zip-downstream-fork.py or it
could be factored out into a common library.  If a significant number of
people have a setup similar to yours, it may very well be worth doing
that.  You'd also need to add the check for upstream commits.
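
As a rough illustration of the "move the tree under the correct subdir"
step (this is not code from either script), the sketch below models a
tree as a plain dict mapping path to blob id.  The function name
move_under_subdir and the dict model are assumptions made for the
example; the real scripts work on git tree objects, and the sketch
ignores the harder question of which parent to take the rest of the
tree from in a merge.

```python
def move_under_subdir(commit_tree, parent_tree, subdir):
    """Return a new flat tree with commit_tree re-rooted under subdir.

    Both trees are dicts mapping path -> blob id.  This is a hypothetical
    model for illustration only; it is not how the real scripts store trees.
    """
    prefix = subdir.rstrip('/') + '/'

    # Preserve everything from the parent that lives outside the subdir.
    result = {path: blob for path, blob in parent_tree.items()
              if not path.startswith(prefix)}

    # Place the umbrella commit's own files under the subdir.
    for path, blob in commit_tree.items():
        result[prefix + path] = blob

    return result


if __name__ == '__main__':
    parent = {'llvm/README': 'old', 'clang/README': 'clang-readme'}
    commit = {'README': 'new', 'lib/X.cpp': 'x'}
    for path, blob in sorted(move_under_subdir(commit, parent, 'llvm').items()):
        print(path, blob)
```

Note that a README in the umbrella commit ends up at llvm/README here,
which is one possible answer to the name-clash question raised in the
script comment above, not necessarily the right one for every setup.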

Now that I think about it, what you really want is something that runs
migrate-downstream-fork.py on the commits in llvm and something that
runs zip-downstream-fork.py on commits in other projects, but they have
to run simultaneously to keep the commits in the proper order.  If both
migrate-downstream-fork.py and zip-downstream-fork.py were refactored to
put most of their code in a package/library, then a third tool could be
created to do what you need.  Obviously, that will take some work to
accomplish.  You'd also want James' guidance on changing
migrate-downstream-fork.py.  There are certain enhancements to
zip-downstream-fork.py that I didn't make because I didn't want to mess
with migrate-downstream-fork.py (see the comments at the top of
zip-downstream-fork.py).
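
A hypothetical third tool of this kind might start by walking the
umbrella history once and deciding, per commit, which treatment applies.
The sketch below is only an assumption about how that dispatch could
look: classify_commits and the 'zip'/'migrate' labels are invented for
the example, the input is a simplified list of (githash, submodules)
pairs rather than real git objects, and a commit that both updates
submodules and touches llvm files is simply labelled 'zip'.

```python
def classify_commits(umbrella_commits):
    """Yield (githash, action) for each umbrella commit, oldest first.

    umbrella_commits is an iterable of (githash, submodules) pairs, where
    submodules maps submodule path -> submodule commit hash.  The labels
    are hypothetical:
      'zip'     - the submodule set changed (zip-downstream-fork.py style)
      'migrate' - no submodule change, so the commit touched only umbrella
                  files (migrate-downstream-fork.py style)
    """
    prev_submodules = {}  # no submodules before the start of history
    for githash, submodules in umbrella_commits:
        if submodules == prev_submodules:
            yield githash, 'migrate'
        else:
            yield githash, 'zip'
        prev_submodules = submodules


if __name__ == '__main__':
    history = [
        ('a', {}),                       # before any submodule was added
        ('b', {'tools/clang': 'c1'}),    # submodule added
        ('c', {'tools/clang': 'c1'}),    # llvm-only change
        ('d', {'tools/clang': 'c2'}),    # submodule updated
    ]
    for githash, action in classify_commits(history):
        print(githash, action)
```

Because the commits are classified in a single pass over the umbrella
history, their relative order is preserved without running the two
kinds of rewriting separately.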

zip-downstream-fork.py also doesn't consider submodules of other
submodules.  You can maybe get that to work by altering how
find_submodules looks for submodule commits.  It would have to recurse
over the submodules it finds.
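
The recursion could look roughly like the sketch below.  It does not
reproduce the real find_submodules; the function name
find_submodules_recursive and the list_submodules callback (which is
assumed to return the submodules recorded at one commit of one repo)
are hypothetical and would have to be wired to actual git queries.

```python
def find_submodules_recursive(repo, commit, list_submodules, prefix=''):
    """Return {full_path: (sub_repo, sub_commit)} for all nested submodules.

    list_submodules(repo, commit) is a hypothetical callback returning an
    iterable of (path, sub_repo, sub_commit) tuples for the submodules
    recorded directly in that commit.
    """
    found = {}
    for path, sub_repo, sub_commit in list_submodules(repo, commit):
        full_path = prefix + path
        found[full_path] = (sub_repo, sub_commit)
        # Recurse: the submodule's own commit may record further submodules.
        found.update(find_submodules_recursive(
            sub_repo, sub_commit, list_submodules, full_path + '/'))
    return found


if __name__ == '__main__':
    # Fake data mirroring the layout described in this thread.
    fake = {
        ('llvm', 'u1'): [('tools/clang', 'clang', 'c1'),
                         ('runtimes/compiler-rt', 'compiler-rt', 'r1')],
        ('clang', 'c1'): [('tools/extra', 'clang-tools-extra', 'e1')],
    }
    lookup = lambda repo, commit: fake.get((repo, commit), [])
    for path, info in sorted(find_submodules_recursive('llvm', 'u1', lookup).items()):
        print(path, info)
```

With the layout described above, clang-tools-extra would then be found
at tools/clang/tools/extra relative to the umbrella.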

> If someone else got a similar scenario, let me know. Perhaps we can
> do some joint effort in adapting the zipper script.

Unfortunately, I don't have any bandwidth to hack on this right now.
I'm happy to answer questions, though.

                               -David

