[llvm-dev] [RFC] One or many git repositories?

Mehdi Amini via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 28 21:11:42 PDT 2016


> On Jul 28, 2016, at 7:32 PM, Lang Hames <lhames at gmail.com> wrote:
> 
> Hi Mehdi,
> 
> This is a narrow view IMO: criterion #1 that Chris mentioned for including projects in the monorepo was “must be tightly coupled to specific versions”.
> It means that even with the test suite (and possibly some runtimes) out of the monorepo, all the software that is tightly coupled would be in the monorepo, and that alone would be enough to remove the need for most of the tooling/infrastructure.
> 
> Fair point, but coupling isn't binary: even the test-suite is coupled to the versions of clang that can compile it, it's just relatively loose compared to LLVM/clang. 
> 
> I find it a fairly different scale to clone 3 repos on a bot versus having to keep multiple repositories *in sync* (i.e. cross repository synchronization).
> 
> I think it depends on the nature of the tools that are required. Bots are relatively simple since they're only reading from the repos, not writing. They're not the only use-case I have in mind though.
> 
> Different problems, different tools… I’m against artificially creating “problems” for upstream developers only because the tooling to solve them works for downstream users.
> 
> I don't think these are actually different problems: I would guess that the problem of collecting some subset of the LLVM projects into a usable source-tree is shared by many downstream users, and it's common in my workflows (e.g. just checking out llvm and lld). It will have to be solved by someone, since downstream users need it even if we adopted a mono-repo.

What I meant by “different problem” is that downstream users, for instance, don’t need to commit, which makes their workflow quite different from an upstream developer’s (for example, it is fairly easy to maintain a read-only view of the existing individual git repos currently on llvm.org).

Also, while we can create scripts for (almost) every scenario, one has to weigh a script that is run once at checkout time against the set of scripts required for day-to-day development. For example: what if I want to switch my tree to my work-in-progress branch, where I changed an LLVM library to use the new “Error checking” API and adapted all the other projects that use this API, and then rebase this branch on master for every project so that I can get ready to push? My impression is that a single repo makes this use case trivial with a standard set of git commands.
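
As a rough sketch of that difference, the snippet below only builds the git command lines for each layout and prints them; it does not run anything. The checkout names, the branch name error-api-wip, and the rebase_plan helper are all made up for illustration, and none of this is existing LLVM tooling.

```python
from typing import List


def rebase_plan(checkouts: List[str], branch: str,
                upstream: str = "origin/master") -> List[List[str]]:
    """Build (but do not run) the git commands that switch each checkout
    to `branch` and rebase it onto `upstream`."""
    remote = upstream.split("/", 1)[0]  # "origin/master" -> "origin"
    plan = []
    for path in checkouts:
        plan.append(["git", "-C", path, "fetch", remote])
        plan.append(["git", "-C", path, "checkout", branch])
        plan.append(["git", "-C", path, "rebase", upstream])
    return plan


# Hypothetical split layout: one checkout per project touched by the change.
split_plan = rebase_plan(["llvm", "clang", "lld"], "error-api-wip")

# Hypothetical single-repo layout: one checkout holds every project.
single_plan = rebase_plan(["llvm-project"], "error-api-wip")

for name, plan in (("split", split_plan), ("single", single_plan)):
    print(f"{name}: {len(plan)} commands")
    for cmd in plan:
        print("  " + " ".join(cmd))
```

The commands are the same in both cases; the split layout repeats them once per project, and as far as I can tell each repository then ends up rebased onto its own master independently of the others.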

I believe a repo like https://github.com/llvm-project/llvm-project supports most of the workflows (both for developers and downstream users) with little to no tooling required. Providing a read-only export from this repo is also fairly easy, and can be done asynchronously in a deterministic way (unlike the submodule umbrella update, which requires some server-side hooks).
The only two unanswered drawbacks that I have seen in this thread are:

1) A “major drawback of a single huge repo IMHO: In git, to push a commit you must have it at the remote HEAD. If HEAD has changed you need to rebase/rebuild/retest/retry. With a single monster repo, a commit to 'lld' means I have to go through this pain to put in my 'clang' tweak.” http://lists.llvm.org/pipermail/llvm-dev/2016-July/102656.html
2) Chris Bieneman: What about a *contributor* only wanting to contribute to compiler-rt? He has to pay the price of cloning the full repo. http://lists.llvm.org/pipermail/llvm-dev/2016-July/103052.html

I haven’t seen a good answer for 1). For 2), it will come down to weighing how much of a burden it is in 2016 to download 500MB once in order to contribute to a project against how many people (and how many commits) this represents.

> A shared solution (if it's possible) may be an opportunity to both share infrastructure with downstream projects and adopt a more modular approach to the LLVM project sources.

My impression is that the sources are already “modular” today, and that this is painful when you work across projects (luckily I have been focused on LLVM itself lately…).
Rather than a “more modular approach to the LLVM project sources”, I’d favor aiming for “a more coherent approach to maintaining the LLVM project sources”.

> I'm staying deliberately light on specifics here. As I said I don't have strong feelings yet -- I'm still digesting all the ideas in this thread.

The other thread, on the submodules proposal driven by Renato, also has a lot of ideas and workflow descriptions if you’re looking for inspiration.

— 
Mehdi



> To the extent that I have a gut feeling though, this feels like it introduces very strong coupling between LLVM project sources (more than is required by the projects APIs) for the sake of convenience, so I'm trying to consider the alternatives.
> 
> Cheers,
> Lang.
> 
> 
> On Thu, Jul 28, 2016 at 6:41 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
> 
>> On Jul 28, 2016, at 6:23 PM, Lang Hames via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> 
>> Aaaand I'm (mostly) caught up. Phew.
>> 
>> FWIW Chris B is right: I had been put off commenting on this thread by the length, and the number of git discussions that have come before this. He convinced me to make the effort to put my 2 cents in though - thanks Chris.
>> 
>> So - for my use-case I don't have strong feelings one way or the other* <https://www.youtube.com/watch?v=fpaQpyU_QiM>. That said, something about the discussion so far strikes me as dissonant: If we're going to break out some sub-projects (the test-suite for licensing reasons, the runtimes for modularity) then it's not really a mono-repo any more. It's a multi-repo where we've collapsed some (but not all) of the existing repos.
> 
> This is a narrow view IMO: criterion #1 that Chris mentioned for including projects in the monorepo was “must be tightly coupled to specific versions”.
> It means that even with the test suite (and possibly some runtimes) out of the monorepo, all the software that is tightly coupled would be in the monorepo, and that alone would be enough to remove the need for most of the tooling/infrastructure.
> 
> 
>> To the extent that we have to build tooling to support multiple-repos (auto-mergers for test bots, command line utils for devs who want the main repo plus tests plus ...), could we re-use that to keep the existing modular project setup?
> 
> I find it a fairly different scale to clone 3 repos on a bot versus having to keep multiple repositories *in sync* (i.e. cross repository synchronization).
> 
> 
>> This might be a fairly low-benefit proposition if the tools we develop were only usable by in-tree projects, but there are many other users of LLVM (Swift leaps to mind since I'm at Apple, but there are many others) who might appreciate the ability to use LLVM-provided tools to pick-and-mix LLVM projects into their repos. Otherwise, every downstream user will have to roll some version of these tools themselves. 
> 
> Different problems, different tools… I’m against artificially creating “problems” for upstream developers only because the tooling to solve them works for downstream users.
> 
>> Mehdi
> 
> 
>> 
>> On Thu, Jul 28, 2016 at 3:19 PM, Renato Golin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> On 28 July 2016 at 22:12, Chris Bieneman <beanz at apple.com> wrote:
>> > It is worth pointing out the Jenkins job that runs that is a playground I setup for myself. It is nowhere near production ready, and it will fail frequently as I iterate messing around with it.
>> 
>> Sure, I think that's implied.
>> 
>> cheers,
>> --renato
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
