[llvm-dev] [RFC] One or many git repositories?

Wed Jul 20 18:17:23 PDT 2016

> I actually would like to see an example of how you would check out a common subset with the sparse checkout feature. jlebar, could you give us demo commands for this?

$ git clone --depth 1 https://github.com/llvm-project/llvm-project.git
$ cd llvm-project
$ ls
clang  clang-tools-extra  compiler-rt  dragonegg  klee ...
$ git config core.sparsecheckout true
$ echo "/llvm
/clang" > .git/info/sparse-checkout
$ git read-tree -mu HEAD
$ ls
clang llvm

I suppose you could even wrap this in a script and ship that with
llvm, if you wanted.
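
A minimal sketch of what such a wrapper could look like, assuming it is
sourced and run from the top of an existing clone. The function name
sparse_checkout is hypothetical and nothing like it ships with llvm today;
it only repeats the commands above, taking the top-level directories to keep
as arguments.

```shell
#!/bin/sh
# Hypothetical helper (not part of llvm): restrict the working tree of an
# existing clone to the given top-level directories.
sparse_checkout() {
    if [ "$#" -eq 0 ]; then
        echo "usage: sparse_checkout DIR..." >&2
        return 1
    fi

    # Refuse to run outside a git work tree.
    if ! git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
        echo "error: not inside a git work tree" >&2
        return 1
    fi

    git_dir=$(git rev-parse --git-dir) || return 1
    git config core.sparsecheckout true || return 1
    mkdir -p "$git_dir/info"

    # One pattern per line; the leading slash anchors it at the repo root.
    : > "$git_dir/info/sparse-checkout"
    for dir in "$@"; do
        printf '/%s\n' "$dir" >> "$git_dir/info/sparse-checkout"
    done

    # Update the working tree to match the new patterns.
    git read-tree -mu HEAD
}
```

With that sourced, "sparse_checkout llvm clang" from the top of the clone
should leave you with the same two directories as the commands above.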

On Wed, Jul 20, 2016 at 5:46 PM, Chandler Carruth via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> On Wed, Jul 20, 2016 at 5:36 PM Justin Bogner <mail at justinbogner.com> wrote:
>>
>> Chandler Carruth <chandlerc at google.com> writes:
>> > On Wed, Jul 20, 2016 at 5:02 PM Justin Bogner via llvm-dev <
>> > llvm-dev at lists.llvm.org> wrote:
>> >
>> >> Justin Lebar via llvm-dev <llvm-dev at lists.llvm.org> writes:
>> >> > I would like to (re-)open a discussion on the following specific
>> >> > question:
>> >> >
>> >> >   Assuming we are moving the llvm project to git, should we
>> >> >   a) use multiple git repositories, linked together as subrepositories
>> >> >      of an umbrella repo, or
>> >> >   b) use a single git repository for most llvm subprojects.
>> >> >
>> >> > The current proposal assembled by Renato follows option (a), but I
>> >> > think option (b) will be significantly simpler and more effective.
>> >> > Moreover, I think the issues raised with option (b) are either
>> >> > incorrect or can be reasonably addressed.
>> >> >
>> >> > Specifically, my proposal is that all LLVM subprojects that are
>> >> > "version-locked" (and/or use the common CMake build system) live in a
>> >> > single git repository.  That probably means all of the main llvm
>> >> > subprojects other than the test-suite and maybe libc++.  From looking
>> >> > at the repository today that would be: llvm, clang, clang-tools-extra,
>> >> > lld, polly, lldb, llgo, compiler-rt, openmp, and parallel-libs.
>> >>
>> >> FWIW, I'm opposed. I'm not convinced that the problems with multiple
>> >> repos are any worse than the problems with a single repo, which makes
>> >> this more or less just change for the sake of change, IMO.
>> >>
>> >
>> > It would be useful to know what problems you see with a single repo that
>> > are more significant. In particular, either why you think the problems
>> > jlebar already mentioned are worse than he sees them, or what other
>> > problems there are that he hasn't addressed.
>>
>> Running the same 'git checkout' commands on multiple repos has always
>> been sufficient to manage the multiple repos so far - as long as you
>> create the same branches and tags in each repo, it's easy[1] to manage
>> the set of repos with a script that cd's to each one and runs whatever
>> git command.
>
>
> A notable difference is the ability to do API updates across them or the
> ability to bisect across them.
>
> Also, if the infrastructure that keeps the umbrella repo in sync falls over
> or has a serious problem, reconstructing version-locked state in order to
> bisect across those regions of time seems quite challenging. So IMO, it
> isn't a minor inconvenience, even if it is something we could overcome.
>
>>
>> So it's a pretty minor inconvenience today to have the multiple repos in
>> the case where you want to check out all of them.
>>
>> OTOH, if all of the repos are combined into one, you have to do work
>> when you only want some of them. In my experience, this is basically
>> always - between my various machines and projects I have several
>> checkouts of llvm+compiler-rt+clang+libc++, and I have a lot of
>> checkouts of just llvm. I've only checked out the other repos when I was
>> changing APIs and needed to update them.
>>
>> I haven't tried the options jlebar has described to deal with these -
>> sparse checkouts and whatnot, but they seem like an equivalent amount of
>> work/learning curve as writing a script that cd's to several directories
>> and runs the same git command in each.
>
>
> I actually would like to see an example of how you would check out a common
> subset with the sparse checkout feature. jlebar, could you give us demo
> commands for this?
>
> In particular, I've had a lot of folks come up and ask me for my script to
> walk all the directories and run the appropriate git commands in them, and
> if it is easier to have the GettingStarted page document how to use the
> sparse checkout thing, that would be nice.