[llvm-dev] Git Transition status?

David Chisnall via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 17 07:24:00 PST 2017

On 17 Jan 2017, at 01:17, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> - Monorepo is the “natural” way to use git.  Submodules are possible to use, but add significant complexity.

Having used submodules in a couple of projects, I’ve not found them to cause more difficulty than they avoided.  They do, however, have an issue specific to GitHub: its automatically generated tarballs don’t include submodule contents, so packages are slightly harder to construct (they must point to two releases).

> - The download size of a mono-repo is manageable, and seems scalable for a project the size of LLVM (including reasonable growth over the next 10 years).

The download size of a mono-repo is fine for anyone who would be checking out LLVM today.  However, compiler-rt and libc++ are both useful without any of the rest of LLVM, and contributors to libc++ rarely check out anything more than libc++ (and perhaps libc++abi) today.

> - As Mehdi says, according to surveys and discussions in forums like the LLVM Dev Meeting BoF, most people who care are in favor of mono-repo.

From the online surveys, I think the split was roughly 50:50.  I’d be very hesitant to regard anything at a BoF as representative of the wider community, as the set of people who have the time and funding to attend a conference is quite distinct from the wider community (particularly for the US DevMeeting, which is right in the middle of university term times).  We’ve made this mistake in FreeBSD before.

> - The people most impacted by mono-repo are those who want to build just compiler-rt.  We want these people to be happy, but they are very few in number, and their benefit needs to be balanced against the benefit for the larger community that builds llvm (and typically clang or another front end).

I believe that the big win for the monorepo is the ability to bisect usefully.  It’s currently very difficult to bisect clang: you can’t bisect clang and llvm independently (LLVM API changes frequently break clang), and because they’re in different git repos (or non-enclosing svn subtrees), keeping the two in step needs some manual intervention.  Having them in the same repo would ensure that they are in sync and make bisecting trivial.
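To make the bisecting point concrete, here is a minimal sketch of the search that bisection performs over a single linear history.  It assumes that every revision is one consistent llvm+clang state (which is what a monorepo would give us), that the first revision is known good and the last known bad, and that a regression stays broken once introduced.  The revision names and the `regressed` predicate are hypothetical stand-ins for a real build-and-test step.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() is true.

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return revisions[hi]


# Hypothetical monorepo history: r100 .. r115, one revision per commit.
history = [f"r{n}" for n in range(100, 116)]


def regressed(rev: str) -> bool:
    # Stand-in for "build clang at this revision and run the failing test".
    return int(rev[1:]) >= 111


print(first_bad(history, regressed))  # r111
```

With two separate repos there is no single `history` list to search: every probe first needs a matching pair of revisions to be chosen by hand, which is the manual intervention mentioned above.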

In contrast, there is not (and should not be) tight coupling between LLVM and libc++, libunwind, libc++abi, and compiler-rt.  There *may* be ordering requirements (e.g. revision X of libc++ requires revision Y of clang for its C++17 features to work), but it is incredibly valuable to bisect these independently to find whether a particular change is a new compiler bug, a new library bug, or an old library bug that is triggered by new compiler behaviour (or an old compiler bug that is triggered by new code).

I would be in favour of a monorepo for everything that links against LLVM libraries and everything else being in separate repos.

> Overall, it seems clear that either approach could work, but mono seems to win out because it is more popular and more simple. It would require tweaks to LLVM’s cmake system though: instead of deciding to build a subproject based on whether it is checked out, it should instead be based on configuration time flags.

I believe that most of this works already - you can opt out of building components that are checked out.
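As a rough illustration of the two policies being discussed (build whatever is checked out, versus build what a configuration-time flag asks for), here is a small sketch.  It is not LLVM’s CMake logic: the function names, the directory layout, and the semicolon-separated flag value are all assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, List


def projects_by_checkout(source_root: Path, known: Iterable[str]) -> List[str]:
    """Checkout-driven policy: build every known project whose directory exists."""
    return [name for name in known if (source_root / name).is_dir()]


def projects_by_flag(enabled: str, known: Iterable[str]) -> List[str]:
    """Flag-driven policy: build only the projects that were explicitly requested.

    `enabled` is a semicolon-separated list (hypothetical flag value).
    """
    known = list(known)
    requested = {p.strip() for p in enabled.split(";") if p.strip()}
    unknown = requested - set(known)
    if unknown:
        raise ValueError(f"unknown projects requested: {sorted(unknown)}")
    # Preserve the order of `known` so the result is deterministic.
    return [name for name in known if name in requested]


KNOWN = ["clang", "lld", "compiler-rt", "libcxx"]  # hypothetical project list

print(projects_by_flag("clang;lld", KNOWN))  # ['clang', 'lld']
```

In a monorepo every directory is always present, so the checkout-driven policy would build everything; the flag-driven one keeps the choice with whoever configures the build.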

