[llvm-dev] [RFC] One or many git repositories?

Wed Jul 20 18:04:45 PDT 2016

Mehdi Amini <mehdi.amini at apple.com> writes:
>> Running the same 'git checkout' commands on multiple repos has always
>> been sufficient to manage the multiple repos so far - as long as you
>> create the same branches and tags in each repo, it's easy[1] to manage
>> the set of repos with a script that cd's to each one and runs whatever
>> git command.
>> 
>> So it's a pretty minor inconvenience today to have the multiple repos in
>> the case where you want to check out all of them.
>> 
>> OTOH, if all of the repos are combined into one, you have to do work
>> when you only want some of them. In my experience, this is basically
>> always - between my various machines and projects I have several
>> checkouts of llvm+compiler-rt+clang+libc++, and I have a lot of
>> checkouts of just llvm. I've only checked out the other repos when I was
>> changing APIs and needed to update them.
>> 
>> I haven't tried the options jlebar has described to deal with these -
>> sparse checkouts and whatnot, but they seem like an equivalent amount of
>> work/learning curve as writing a script that cd's to several directories
>> and runs the same git command in each.
>> 
>> Thus, this also sounds like a minor inconvenience. I just don't see how
>> trading one for the other is worth doing, since AFAICT they're equally
>> inconvenient.
>
> IIUC you seem to explain that there are minor inconveniences on both
> sides, but then I’m not sure about why you are opposed? It seems pretty
> equal,

I should clarify, this is a -0 kind of opposed. If people overwhelmingly
think this is the way to go, I won't try to block it or anything. I'd
rather not have to update a bunch of workflow, infrastructure, and bots
for no particular reason though.
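
For reference, the kind of helper mentioned in the quoted text (a script
that cd's to each repo and runs the same git command) could look roughly
like the sketch below. The function name run_in_repos and the directory
names in the usage note are assumptions for illustration only, not
something the project provides.

```python
import subprocess
from pathlib import Path


def run_in_repos(repos, command):
    """Run the same command in each directory.

    Returns a dict mapping each path to the command's exit code,
    or to None when the directory is not checked out locally.
    """
    results = {}
    for repo in repos:
        path = Path(repo)
        if not path.is_dir():
            # Not every machine has every repo checked out; skip it.
            results[str(path)] = None
            continue
        # cwd= plays the role of "cd" into the repo before running.
        proc = subprocess.run(command, cwd=path)
        results[str(path)] = proc.returncode
    return results
```

Assuming sibling checkouts named llvm and clang, a call such as
run_in_repos(["llvm", "clang"], ["git", "checkout", "-b", "my-branch"])
would create the same branch in both, which only works cleanly if the
branches and tags are kept in sync across the repos as described above.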

> Also the minor inconvenience in the case of the monolithic repository
> is happening during the initial setup/clone/checkout, and not during
> day-to-day development (git pull, git checkout -b, git commit, git
> push), while the split model induces “minor inconveniences” in the
> day-to-day developer interaction.
> I.e. I prefer using a script to check out and set up the repo, and then
> be able to use the standard git commands for interacting with it.
>
>
>> [1] My understanding of the "umbrella repo" thing for bisecting is that
>>    it'll be managed automatically by a cron or checkin hooks or
>>    whatever,
>
> That’s also something that is fragile to me without a deterministic
> way to reconstruct it identically from scratch using only the split
> repositories (which would be possible with "git notes" attached by a
> server-side hook for instance, but unfortunately github does not allow
> it, and the current split-repository proposal excludes even
> *discussing* the merits of other hosting services).

I haven't been following that discussion, but that seems surprising
since AFAICT the only particularly compelling reason to move away from
SVN is that it's easy to find good reliable hosting.

>
>> so the bits in jlebar's description about updating
>>    submodules seem like a red herring. I'm assuming that we end up in a
>>    place where working with git is essentially the same as we work with
>>    git-svn today.
>
> Some people manage today to have a single commit that updates
> clang+llvm at the same time.
> I believe doing this in the split-repository model requires
> write-access to the umbrella repo.