[llvm-dev] [RFC] One or many git repositories?

Simon Taylor via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 26 04:08:00 PDT 2016

> On 26 Jul 2016, at 10:15, Renato Golin <renato.golin at linaro.org> wrote:
> On 26 July 2016 at 10:09, Simon Taylor via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Thus downstream developers can continue to use the read-only view of the independent projects if that is easier for them; but people hacking on llvm/clang itself get the benefits of easier checkout, patching, bisection, atomic commits between projects, etc that come from using a monorepo as the official repository.
> Would this read-only repositories remain with the synchronous version
> stream? I think this was one of the points against pure-git without
> sub-modules and without monolithic repository.

Which “synchronous version stream” are you referring to?

My understanding is that currently the official repos are in SVN but are separate for each project.

That situation could of course be recreated exactly with git.

The downside is that some of the projects have cross-dependencies (clang rev x will only work with llvm rev y) and I don’t believe these cross-repository dependencies are currently stored anywhere.

If my understanding is mistaken, then apologies, I don’t do any day-to-day work with LLVM.

git submodules would let you add an umbrella repository that would ensure the submodules were at mutually compatible versions.

An alternative is to use a monorepo as the ultimate source of truth and the “official upstream”, which would ensure all projects are mutually compatible, and make cross-project patches, bisection, etc. easier.

With a monorepo upstream it would still be possible to maintain read-only views of parts of the repository (i.e. individual projects) by projecting commits from the monorepo.

Say a patch is committed to the monorepo that touches libc++, clang, and llvm. Those 3 individual read-only repos would then each be updated with the parts of the commit that affect their files. The commit message would be the same as in the monorepo, but would have a line added referencing the monorepo commit (in the same way the git repos currently list the svn rev in their commit messages). It would also be possible to maintain a read-only umbrella repo that references the individual ones as submodules; that would also receive a commit updating the versions of the individual git repos.

These read-only projections of the monorepo would be entirely deterministic - every commit in the monorepo would generate a matching commit in any project that it touches (and a commit in the umbrella submodule-based repo too if desired). The read-only views could be regenerated from scratch from the monorepo [and optionally a starting state, so you could keep the existing hashes in the current individual git repo views].
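
To make the projection described above a bit more concrete, here is a rough sketch in Python. It is only an illustration, not an existing tool: the names `MonoCommit`, `ProjectedCommit` and `project_commit` are hypothetical, and I am assuming that each project lives in its own top-level directory of the monorepo and that the reference line takes the form `monorepo-commit: <sha>`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MonoCommit:
    """A commit in the (hypothetical) monorepo."""
    sha: str
    message: str
    files: tuple[str, ...]  # paths relative to the monorepo root


@dataclass(frozen=True)
class ProjectedCommit:
    """The slice of a monorepo commit that lands in one read-only repo."""
    project: str
    message: str
    files: tuple[str, ...]  # paths relative to the project root


def project_commit(commit: MonoCommit) -> list[ProjectedCommit]:
    """Split one monorepo commit into one commit per touched project.

    Assumption: the first path component names the project.
    """
    by_project: dict[str, list[str]] = {}
    for path in commit.files:
        project, sep, rest = path.partition("/")
        if not sep:
            continue  # a file at the monorepo root belongs to no project
        by_project.setdefault(project, []).append(rest)

    # Same message as the monorepo commit, plus a line pointing back at it.
    trailer = f"monorepo-commit: {commit.sha}"
    message = f"{commit.message.rstrip()}\n\n{trailer}"

    # Sorting keeps the output independent of the input file order.
    return [
        ProjectedCommit(project=name, message=message, files=tuple(sorted(paths)))
        for name, paths in sorted(by_project.items())
    ]


if __name__ == "__main__":
    commit = MonoCommit(
        sha="abc123",
        message="Fix foo across projects",
        files=("llvm/lib/IR/Foo.cpp", "clang/lib/Sema/Bar.cpp", "libcxx/include/baz"),
    )
    for projected in project_commit(commit):
        print(projected.project, projected.files)
```

Because the output depends only on the monorepo commit itself, rerunning this over the full history would give the same per-project commits every time, which is the property that lets the read-only views be regenerated from scratch.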

Whether or not to go with a monorepo really depends for me on how intertwined the modules are, and how often cross-repo commits happen. That’s not something I know personally, so I won’t make any recommendations either way.

