[llvm-dev] [RFC] One or many git repositories?

Mehdi Amini via llvm-dev llvm-dev at lists.llvm.org
Wed Jul 20 17:53:07 PDT 2016


> On Jul 20, 2016, at 5:36 PM, Justin Bogner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Chandler Carruth <chandlerc at google.com> writes:
>> On Wed, Jul 20, 2016 at 5:02 PM Justin Bogner via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> 
>>> Justin Lebar via llvm-dev <llvm-dev at lists.llvm.org> writes:
>>>> I would like to (re-)open a discussion on the following specific
>>> question:
>>>> 
>>>>  Assuming we are moving the llvm project to git, should we
>>>>  a) use multiple git repositories, linked together as subrepositories
>>>> of an umbrella repo, or
>>>>  b) use a single git repository for most llvm subprojects.
>>>> 
>>>> The current proposal assembled by Renato follows option (a), but I
>>>> think option (b) will be significantly simpler and more effective.
>>>> Moreover, I think the issues raised with option (b) are either
>>>> incorrect or can be reasonably addressed.
>>>> 
>>>> Specifically, my proposal is that all LLVM subprojects that are
>>>> "version-locked" (and/or use the common CMake build system) live in a
>>>> single git repository.  That probably means all of the main llvm
>>>> subprojects other than the test-suite and maybe libc++.  From looking
>>>> at the repository today that would be: llvm, clang, clang-tools-extra,
>>>> lld, polly, lldb, llgo, compiler-rt, openmp, and parallel-libs.
>>> 
>>> FWIW, I'm opposed. I'm not convinced that the problems with multiple
>>> repos are any worse than the problems with a single repo, which makes
>>> this more or less just change for the sake of change, IMO.
>>> 
>> 
>> It would be useful to know what problems you see with a single repo that
>> are more significant. In particular, either why you think the problems
>> jlebar already mentioned are worse than he sees them, or what other
>> problems are that he hasn't addressed.
> 
> Running the same 'git checkout' commands on multiple repos has always
> been sufficient to manage the multiple repos so far - as long as you
> create the same branches and tags in each repo, it's easy[1] to manage
> the set of repos with a script that cd's to each one and runs whatever
> git command.
> 
> So it's a pretty minor inconvenience today to have the multiple repos in
> the case where you want to check out all of them.
> 
> OTOH, if all of the repos are combined into one, you have to do work
> when you only want some of them. In my experience, this is basically
> always - between my various machines and projects I have several
> checkouts of llvm+compiler-rt+clang+libc++, and I have a lot of
> checkouts of just llvm. I've only checked out the other repos when I was
> changing APIs and needed to update them.
> 
> I haven't tried the options jlebar has described to deal with these -
> sparse checkouts and whatnot, but they seem like an equivalent amount of
> work/learning curve as writing a script that cd's to several directories
> and runs the same git command in each.
> 
> Thus, this also sounds like a minor inconvenience. I just don't see how
> trading one for the other is worth doing, since AFAICT they're equally
> inconvenient.

IIUC, you are saying that there are minor inconveniences on both sides, but then I’m not sure why you are opposed. By your own account the two options seem roughly equal.

Also, the minor inconvenience with the monolithic repository occurs only during the initial setup (clone/checkout), not during day-to-day development (git pull, git checkout -b, git commit, git push), whereas the split model induces “minor inconveniences” in day-to-day developer interaction.
I.e., I prefer using a script to check out and set up the repo once, and then being able to use the standard git commands to interact with it.
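
To make the "script for setup" idea concrete, here is a rough sketch of what such a one-time setup could look like for a partial checkout of a monolithic repo. It only builds the commands and the sparse-checkout pattern file rather than running anything. The repository URL, the helper name sparse_setup_plan, and the assumption that each subproject sits in a top-level directory of the same name are all hypothetical; the git pieces used (clone --no-checkout, core.sparseCheckout, .git/info/sparse-checkout, read-tree -mu HEAD) are the standard sparse-checkout mechanism.

```python
def sparse_setup_plan(url, dest, subprojects):
    """Return (commands, patterns) for a one-time sparse checkout.

    Assumption: each subproject is a top-level directory in the
    monolithic repository (hypothetical layout).
    """
    # Contents for <dest>/.git/info/sparse-checkout, one directory per line.
    patterns = "".join(f"/{name}/\n" for name in subprojects)
    commands = [
        # Fetch history without populating the working tree.
        ["git", "clone", "--no-checkout", url, dest],
        # Enable sparse checkout for this clone.
        ["git", "-C", dest, "config", "core.sparseCheckout", "true"],
        # The patterns file must be written before this last step, which
        # populates the index and working tree with the matching paths.
        ["git", "-C", dest, "read-tree", "-mu", "HEAD"],
    ]
    return commands, patterns


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    cmds, pats = sparse_setup_plan(
        "https://example.org/llvm-monorepo.git", "llvm-mono", ["llvm", "clang"]
    )
    for cmd in cmds:
        print(" ".join(cmd))
    print(pats, end="")
```

After a setup like this, the day-to-day commands (git pull, git checkout -b, git commit, git push) are run as usual in the single clone.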


> [1] My understanding of the "umbrella repo" thing for bisecting is that
>    it'll be managed automatically by a cron or checkin hooks or
>    whatever,

That also seems fragile to me without a deterministic way to reconstruct the umbrella repo identically from scratch using only the split repositories (which would be possible with “git notes” attached by a server-side hook, for instance, but unfortunately GitHub does not allow it, and the current split-repository proposal excludes even *discussing* the merits of other hosting services).


> so the bits in jlebar's description about updating
>    submodules seem like a red herring. I'm assuming that we end up in a
>    place where working with git is essentially the same as we work with
>    git-svn today.

Some people manage today to have a single commit that updates clang+llvm at the same time.
I believe doing this in the split-repository model requires write access to the umbrella repo.


— 
Mehdi
