[llvm-dev] [RFC] One or many git repositories?

Justin Lebar via llvm-dev llvm-dev at lists.llvm.org
Sun Jul 31 00:06:34 PDT 2016


> And if it is, then the "only thing a monorepo gets you" isn't something that you need a monorepo to get.

This is an *extremely important* point to understand, so let me try to
be really clear about the current state of the world and the state of
the world under the two "move to git" proposals.

Today, all commits ultimately end up in SVN.  Our SVN is effectively
a monorepo, so today, a single commit can touch multiple subprojects.
How you get the commit into SVN is your business.  Maybe you can hack
git-svn somehow to do the atomic commit.  (If this is possible, it's
beyond my ken.)  Alternatively you can just commit via SVN.  If you're
a git user, I wrote a hacky script [1] that cherry-picks commits from
the existing monorepo mirror and commits them via SVN.  It's annoying
to do, but it is possible today to atomically commit to multiple
subprojects, as you observed.

Under the monorepo proposal, this becomes much easier.  It's just "git
commit", no magic.

Under the multirepo git proposal, this becomes either impossible or
much more complicated.  Under the proposal, we have separate git
repositories for each subproject, and we push directly to these.
There's then an umbrella repository, which includes the subproject
repos as git submodules.  There's a script which periodically checks
the subproject repos for updates.  When it sees an update, it creates
a new commit in the umbrella repository.  The script is the only thing
that can create commits in the umbrella repo.
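
To make the consequence concrete, here is a minimal sketch of that
polling loop, modelled in plain Python with no real git calls.  The
function name poll_once, the dict-of-SHAs representation, and the
SHAs themselves are all assumptions for illustration; the actual
script may look quite different.  It shows how a change that needs
both an llvm and a clang commit can surface in the umbrella repo as
two separate commits, with an inconsistent state in between.

```python
def poll_once(pins, heads):
    """Model one polling pass of a hypothetical umbrella updater.

    pins:  subproject name -> SHA currently recorded in the umbrella repo
    heads: subproject name -> SHA currently at the tip of that subproject
    Returns the pins for a new umbrella commit, or None if nothing moved.
    """
    changed = {name: sha for name, sha in heads.items()
               if pins.get(name) != sha}
    if not changed:
        return None
    new_pins = dict(pins)
    new_pins.update(changed)
    return new_pins


# Umbrella history, oldest first.  SHAs are made up.
history = [{"llvm": "a1", "clang": "b1"}]

# The llvm half of a cross-project change lands, and the script polls
# before the clang half has been pushed.
history.append(poll_once(history[-1], {"llvm": "a2", "clang": "b1"}))

# The clang half lands and is picked up by the next poll.
history.append(poll_once(history[-1], {"llvm": "a2", "clang": "b2"}))

for pins in history:
    print(pins)
```

In this model the middle umbrella commit pairs the new llvm with the
old clang, which is exactly the state an atomic commit is supposed to
rule out.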

In order to get atomic commits in the multirepo world, we would need
some way to inform the script that two otherwise separate commits
should appear in the umbrella repo as a single commit.  We'd probably
need to agree on a protocol communicated via commit messages.  We'd
also probably need client-side scripts to set the commit messages
appropriately.
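
As a rough sketch of what such a protocol might involve, the Python
below groups pushed commits by a trailer in the commit message.  The
"Atomic-Group: <id> <count>" trailer, the function name
plan_umbrella_commits, and the tuple layout are all invented here for
illustration; nothing like this has been specified in either
proposal.  The sketch also ignores the hard parts, such as what to do
when later commits land on a subproject while a group is still
incomplete.

```python
import re
from collections import defaultdict

# Hypothetical trailer: "Atomic-Group: <group-id> <number-of-parts>"
TRAILER = re.compile(r"^Atomic-Group:\s*(\S+)\s+(\d+)\s*$", re.MULTILINE)


def plan_umbrella_commits(pushed):
    """Decide which umbrella commits to create.

    pushed: list of (subproject, sha, commit_message) in arrival order.
    Returns a list of dicts mapping subproject -> sha, one dict per
    umbrella commit.  Commits in an incomplete group are held back.
    """
    plans = []
    pending = defaultdict(dict)  # group id -> {subproject: sha}
    for subproject, sha, message in pushed:
        match = TRAILER.search(message)
        if match is None:
            # Ordinary commit: gets its own umbrella commit right away.
            plans.append({subproject: sha})
            continue
        group, size = match.group(1), int(match.group(2))
        pending[group][subproject] = sha
        if len(pending[group]) == size:
            # All parts have arrived: emit them as one umbrella commit.
            plans.append(pending.pop(group))
    return plans


pushed = [
    ("llvm", "a2", "Change API\n\nAtomic-Group: g1 2"),
    ("lld", "c2", "Fix typo"),
    ("clang", "b2", "Adapt to new API\n\nAtomic-Group: g1 2"),
]
plans = plan_umbrella_commits(pushed)
print(plans)
```

Even in this toy form, every committer has to get the group id and
the part count right in each commit message, which is why client-side
tooling would likely be needed as well.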

I expect this would be such a hassle that, even if we managed to
implement it on the server side, it would be prohibitively complex for
most users.

In addition, under the multirepo proposal, you only get synchronized
subproject commits in your local checkout if you choose to use a
git-submodules based workflow.  If you use the workflow that we
currently have, then on the client side, there is no guarantee that
your subprojects will be sync'ed.  (This is the same as most people's
client-side git workflows today.)  *Even if we manage to atomically
commit across
subprojects*, that is of limited utility unless those commits show up
atomically on developers' workstations.  But using a workflow based on
git-submodules is highly complex as compared to the monorepo -- this
was what I was trying to illustrate in my very first email on this
thread.

When we say "the monorepo gets you atomic commits," that's shorthand
for:

1) The monorepo makes it far simpler to make atomic commits from git
than the current SVN setup does.
2) Atomic commits are definitely possible in the monorepo.  They are
theoretically possible in the multirepo, with extensive tooling etc.
3) Under the basic monorepo workflow, your checkouts are always
correct with respect to atomic commits.  Under the basic multirepo
workflow, this is not true -- you have to engage with git submodules
to get this property, and that is a giant pain.

Sorry for the wall of text, but this is important.

[1] https://github.com/jlebar/llvm-repo-tools.  Be careful, I've only
made one commit with it so far.  :)

On Sat, Jul 30, 2016 at 10:38 PM, Robinson, Paul <paul.robinson at sony.com> wrote:
>> The only thing a monorepo gets you that strictly isn’t possible without
>> it is the ability to commit to multiple projects in a single commit.
>> Personally I don’t think that is a big enough justification, but that is
>> my opinion, not a fact.
>
> Okay, I just bumped into r277008, in which commits to llvm, clang, and
> clang-tools-extra all have the same SVN revision number.
> I don't know how it happened but it did.  Is this just an artifact of
> how somebody pasted together a bunch of git-svn projects, or is it
> something that a top-level git repo with submodules would allow?
> And if it is, then the "only thing a monorepo gets you" isn't something
> that you need a monorepo to get.
> Your befuddled correspondent,
> --paulr
>

