[libcxx-dev] [monorepo] Much improved downstream zipping tool available

David Greene via libcxx-dev libcxx-dev at lists.llvm.org
Wed Jan 30 13:43:57 PST 2019


Björn Pettersson A <bjorn.a.pettersson at ericsson.com> writes:

> In llvm (split) we have:
>
>   UL4->UL3->UL2->UL1->UL0
>                    \
>          ...->DL2->DL1
>
> In clang (split) we have:
>
>   UC4->UC3->UC2->UC1->UC0
>                    \
>          ...->DC2->DC1
>
>
> DL1 is a commit that updates the clang submodule to DC1 (and in this
> scenario at the same time merges UL1 and DL2 in llvm).

Ok, in that case I would expect the resulting history to look like this:

    UL4->UC2->UL3->UL2->UL1->UL0->UC1 <- monorepo/master
                         |         \
                         \          `---.
                          `------------. \
                                        \|
                            ... ->DL2->DL1/DC1 <- zip/master
                                        /
                            ... ->DC2--'

As a submodule update, DC1 is "inlined" into DL1 and its commit message
is appended to that of DL1.  I'm presuming here that llvm never updated
the clang submodule to DC2, so it remains an independent commit.

The inlining is done assuming that submodule updates represent a single
logical change.  Submodule updates are assumed to be related to whatever
changes happen in the umbrella so they all get smushed together into one
commit.
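
To make the inlining rule concrete, here is a minimal sketch in Python.
It is only a toy model of the idea described above, not the zipping
tool's actual code: the Commit class, the inline_submodule_update
function and the commit messages are all hypothetical, and parents are
plain names rather than rewritten commits.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """Hypothetical stand-in for a commit: a name, a message, parent names."""
    name: str
    message: str
    parents: list = field(default_factory=list)


def inline_submodule_update(umbrella, sub):
    """Fold the submodule commit `sub` into the umbrella commit that points at it."""
    # Keep the umbrella's parents first, then add any of the submodule
    # commit's parents that are not already present.
    parents = list(umbrella.parents)
    for p in sub.parents:
        if p not in parents:
            parents.append(p)
    return Commit(
        name=f"{umbrella.name}/{sub.name}",
        # The submodule commit's message is appended to the umbrella's.
        message=f"{umbrella.message}\n\n{sub.message}",
        parents=parents,
    )


# Made-up messages; the parent lists follow the graphs in this thread.
dl1 = Commit("DL1", "Merge upstream llvm, update clang submodule", ["DL2", "UL1"])
dc1 = Commit("DC1", "Merge upstream clang", ["DC2", "UC1"])
zipped = inline_submodule_update(dl1, dc1)
print(zipped.name, zipped.parents)
```

In this model the combined commit DL1/DC1 ends up with DL2, UL1, DC2 and
UC1 as parents, which is the set of edges drawn into it in the graph.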

The edge UC1->DL1 represents the use of the UC1 tree for every project
*except* llvm, because clang was a submodule of llvm (and was updated to
DC1, which merged UC1) and no other project was a submodule in llvm.
DL1 still has the llvm tree from UL1 plus any local changes you may have
made.
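
One way to picture where each project's tree comes from is the sketch
below.  It is my reading of the rule, written as a toy model: the
compose_tree function, the project list and the "project@commit" labels
are hypothetical and do not come from the tool itself.

```python
def compose_tree(umbrella_tree, submodule_trees, upstream_tree):
    """Pick a source for each project directory of a zipped commit."""
    # Start from the upstream monorepo tree so every project has content.
    tree = dict(upstream_tree)
    # Projects tracked as submodules come from the commits they point at.
    tree.update(submodule_trees)
    # The umbrella project's own directory comes from the umbrella commit.
    tree.update(umbrella_tree)
    return tree


# Assumed contents of UC1; lld and libcxx are example projects only.
uc1 = {
    "llvm": "llvm@UL0",
    "clang": "clang@UC1",
    "lld": "lld@UC1",
    "libcxx": "libcxx@UC1",
}
tree = compose_tree({"llvm": "llvm@DL1"}, {"clang": "clang@DC1"}, uc1)
for project, source in tree.items():
    print(project, "<-", source)
```

Here llvm comes from DL1, clang from DC1 (which already merged UC1), and
the projects that were never submodules are taken from UC1 unchanged.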

Admittedly, this is tricky to understand.  Believe me, there were a lot
of headaches involved trying to figure out what the right thing to do
is.  This is my best stab at that.

I don't think I have a test that creates this kind of graph.  It would
be interesting to see if it works.  :) At the moment I'm busy with other
things.  Give it a try and see if it does what you expect.

> How does git know that it should follow the parent relation from
> DL1 to UL1 for the llvm subdir, and not the UL0->UC1->DC1->DL1
> path? I mean, if I check out commit DC1 I will see the contribution
> from UL0 in the llvm subdir, and DL1 includes the changes from DC1.

With the history above this is no longer an issue since you can't check
out DC1 as such.  It's related to the llvm tree in DL1.

Let's say we have a commit DC3 and commit DL3 updated llvm's clang
submodule to DC3.  Commit DC4 was never referenced in a submodule
update.  The graph should then look like this:

    UL4->UC2->UL3->UL2->UL1->UL0->UC1 <- monorepo/master
                         |         \
                         \          `-------.
                          `----------------. \
                                            \|
                       ... ->DL3/DC3->DL2->DL1/DC1 <- zip/master
                             /\             /
                 ... ->DC4--'  `--->DC2----'

DC3 is related to DL3 so it got inlined.  DC2 has an llvm tree based on
DL3.

Hopefully, this is now clear as mud.  :)

                             -David

