[llvm-dev] [RFC] One or many git repositories?

Wed Jul 20 18:26:40 PDT 2016

Justin Lebar <jlebar at google.com> writes:
>> Running the same 'git checkout' commands on multiple repos has
>> always been sufficient to manage the multiple repos so far
>
> Huh.  It definitely hasn't worked well for me.
>
> Here's the issue I face every day.  I may be working on (unrelated)
> changes to clang and llvm.  I update my llvm tree (say I checked in a
> patch, or I want to pull in changes someone else has checked in).  Now
> I want to go back to hacking on my clang stuff.  Because my clang
> branch is not connected to a specific LLVM revision, it no longer
> compiles.  I'm trying to build an old clang against a new llvm.
>
> Now I have to pull the latest clang and rebase my patches.  After I
> deal with rebase conflicts (not what I wanted to do at the moment!),
> I'm in a new state, which means when I build my ccache is no help.
> And when I run the clang tests, I don't know whether to expect test
> failures.  So then I have to pop off my patches and run at head...
> (Maybe I have to update clang!  In which case I also have to update
> llvm...)
>
> This would all be solved with zero work on my part if llvm and clang
> were in one repository.  Then when I switched to working on my clang
> patches, I would automatically check out a version of LLVM that is
> compatible.
>
> I think this is the main thing that people aren't getting.  Maybe
> because it's never been possible before to have a workflow like this.
> But having a git branch that you can check out and immediately build
> -- without any rebasing, re-syncing, or other messing around -- is
> incredibly powerful.

I don't know, man. When I create a branch to save my clang work, I just
create a branch with the same name in all the other repos I have checked
out, and everything stays in the state I left it in while I go do other
stuff. This kind of problem just hasn't really come up for me.
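
Roughly, the same-named-branch habit looks like the sketch below. The
helper name `checkout_all` and the repo paths are made up for
illustration; the only git commands assumed are `rev-parse` and
`checkout`.

```shell
# Hypothetical helper: switch every listed repo to the same-named branch,
# creating the branch from the current HEAD where it does not exist yet.
checkout_all() {
    branch=$1
    shift
    for repo in "$@"; do
        if git -C "$repo" rev-parse --verify --quiet "refs/heads/$branch" >/dev/null; then
            # Branch already exists in this repo: just switch to it.
            git -C "$repo" checkout -q "$branch"
        else
            # First time here: start the branch from wherever HEAD is now.
            git -C "$repo" checkout -q -b "$branch"
        fi
    done
}
```

Usage would be something like `checkout_all my-clang-work path/to/llvm
path/to/clang` (paths are placeholders). Note that this only keeps the
branch names in step; nothing records which llvm commit a given clang
commit was built against.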

> Please let me know if this is still not clear -- it's kind of the key point.
>
> As I said, you can accomplish this with submodules, too, but it
> requires the complex hackery from my original email.
>
> To me, this is not at all a minor inconvenience.  It's at least an
> hour of wasted time every week.
>
>> I haven't tried the options jlebar has described to deal with these
>> - sparse checkouts and whatnot, but they seem like an equivalent
>> amount of work/learning curve as writing a script that cd's to
>> several directories and runs the same git command in each.
>
> I'll send sparse checkout instructions separately.  But my example
> submodules commands are not at all equivalent to a script that cd's
> into several directories and runs a git command in each, and I think
> this is the main point of confusion.  (In fact you wouldn't need to
> write such a script; it's just "git submodule foreach".)
>
> The submodule commands create a single branch in the umbrella repo
> that encompasses the checked-out state of *all the LLVM subrepos*.  So
> you can, at a later time, check out this branch in the umbrella repo
> and all the clang, llvm, etc. bits will be identical to the last time
> you were on the branch.
>
> If all you want is to continue using git the way you use it now, the
> multiple git repos get you that (as does a sparse checkout on the
> single repo).  My point is that the move to git opens up a new, much
> more powerful workflow with branches that encompass both llvm and
> clang state.  We can do this with or without submodules, but using
> submodules for this is far more awkward than using a single repo.
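
For concreteness, here is a minimal sketch of the single-repo behavior
being described, using a throwaway repository. The llvm/ and clang/
top-level layout, the VERSION files, and the branch name are all
assumptions made up for the demo, not the proposed repository structure.

```shell
#!/bin/sh
set -e

# Scratch monorepo with two hypothetical subproject directories.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example"
git config user.email "example@example.com"
mkdir llvm clang
echo "llvm v1" > llvm/VERSION
echo "clang v1" > clang/VERSION
git add .
git commit -q -m "baseline"
base=$(git rev-parse --abbrev-ref HEAD)   # default branch name varies by git config

# Start clang work; the branch implicitly pins the llvm state it was based on.
git checkout -q -b my-clang-work
echo "clang v1 + my patch" > clang/VERSION
git commit -q -am "clang: my patch"

# Meanwhile the base branch picks up a new llvm change.
git checkout -q "$base"
echo "llvm v2" > llvm/VERSION
git commit -q -am "llvm: upstream change"

# Switching back restores clang/ and llvm/ together, with no rebase needed.
git checkout -q my-clang-work
cat llvm/VERSION clang/VERSION
```

The last command prints the llvm state from when the branch was created
alongside the patched clang state, which is the "check out and
immediately build" property under discussion.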

If I do `git log` in a sparse checkout that just has LLVM, will it only
show me LLVM commits? That is, how easy is it to filter out the
clang/lldb/subproject-X commits from a log? Negative globs are kind of
awkward.
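
For reference, a sparse checkout only changes which files appear in the
working tree, so a plain `git log` there still lists every commit. The
usual way to narrow the log is a positive path limit rather than a
negative glob. A minimal sketch, assuming a single repo with the
subproject in a top-level directory such as llvm/ (the helper name
`log_subproject` is made up for illustration):

```shell
# Hypothetical helper: print the subjects of commits that touch one
# subproject directory. `git log -- <path>` limits history to that path.
log_subproject() {
    repo=$1
    dir=$2
    git -C "$repo" log --format=%s -- "$dir"
}
```

For example, `log_subproject path/to/monorepo llvm` would list only
commits that touched files under llvm/; commits that only touched clang/
or lldb/ are left out.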