[llvm-dev] [RFC] One or many git repositories?

Mon Jul 25 09:54:15 PDT 2016

Hi, all.

I feel like we've strayed pretty far from the question originally
posed in this thread.

One of the pieces of feedback I got before I started this thread was
that many people felt that, the last time the question of multiple
repos vs. monorepo was discussed, it was interspersed with other
topics, making it difficult for some people to weigh in appropriately
(or even to be aware that the discussion was occurring).  I'm afraid
that the discussion of GitHub workflows we're having here may cause
the same problem.

Maybe we can move the discussion about GitHub workflows into a
different thread?  Again, I don't mean to stop it, just move it.

To re-focus this thread on its original topic: It sounds to me like,
broadly speaking, we have consensus on using a single repository.  But
there are still some outstanding related questions.  Among these are:

1) Should the repository have "unified history"?  (Meaning, should I
be able to check out a single git revision from before the migration
and have it contain all of the llvm subprojects?)

2) Should the monorepo have a "nested" repository layout (e.g. clang
goes in /tools/clang) or a "flat" layout (clang goes in /clang)?

3) Assuming we want unified history, should the new canonical
repository's hashes be based on
https://github.com/llvm-project/llvm-project, or should it start
afresh?

FWIW my answers to these are:

1) Yes to unified history.  The main advantage of non-unified history
is that it's easier for people to import old branches -- it's a matter
of "git merge" instead of running the git filter-branch script I
wrote.  But this is a relatively small (~20 minute) one-time cost to
some of us, whereas our repository history is borne by all of us
forever.  Moreover, unified history also helps people with long-running
branches, as it lets them check out old versions of their branch and
get a compatible version of all of the other llvm subprojects.

2) Yes to nested layout.  I find Chandler and Richard Smith's
arguments compelling.

3) No to basing the new canonical repo on
https://github.com/llvm-project/llvm-project.  That repo's history is
missing svn revision numbers, and there are enough emails floating
around that reference svn revision numbers that I think we need them
in our canonical repo.  Also, llvm-project/llvm-project has a flat
structure, and if we end up going with a nested layout, it would be
better to have that layout starting with the first commit.

-Justin

On Mon, Jul 25, 2016 at 8:10 AM, Bruce Hoult via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> git-imerge can run an arbitrary script to decide whether a commit is good or
> bad. Lack of textual merge conflicts is only the most basic test -- you can
> check that it compiles, run tests .. whatever you want and have time to
> execute.
>
> On Tue, Jul 26, 2016 at 2:12 AM, Robinson, Paul via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>>
>>
>> > -----Original Message-----
>> > From: Renato Golin [mailto:renato.golin at linaro.org]
>> > Sent: Monday, July 25, 2016 7:11 AM
>> > To: Daniel Sanders
>> > Cc: Robinson, Paul; llvm-dev at lists.llvm.org
>> > Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>> >
>> > On 25 July 2016 at 14:55, Daniel Sanders <Daniel.Sanders at imgtec.com>
>> > wrote:
>> > > I know of a way but it's not very nice. The gist of it is to checkout
>> > > the downstream branch just before the bad merge and then merge the
>> > > first 100 commits from upstream. If the result is good then merge the
>> > > next 100, but if it's bad then 'git reset --hard' and merge 10
>> > > instead. You'll eventually find the commit that made it bad.
>> > > Essentially, the idea is to make a throwaway branch that merges more
>> > > frequently. I do something similar to rebase my work to master since
>> > > gradually rebasing often causes all the conflicts to go away.
>> >
>> > This is essentially what git-imerge does, you only need to define
>> > "good merge" in the form of a script or CI job.
>> >
>> > cheers,
>> > -renato
>>
>> Except I understood git-imerge to be looking for physical conflicts,
>> not "when did this test start failing."  If it does the latter also,
>> that would be awesome.
>> --paulr
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
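
For reference, the batched-merge search Daniel describes in the quoted
text above (merge 100 upstream commits at a time, and on a bad result
reset and retry with 10) could be sketched roughly as follows.  This is
only an illustration of the search strategy, not git-imerge's algorithm
and not anyone's actual script.  It does not invoke git at all: the
names find_first_bad_commit, merge_is_good, batch and shrink are
hypothetical, and merge_is_good stands in for whatever "merge these,
then build and test" step you would run on a throwaway branch.  It also
assumes that once a merge result goes bad it stays bad.

```python
from typing import Callable, Optional, Sequence


def find_first_bad_commit(
    commits: Sequence[str],
    merge_is_good: Callable[[Sequence[str]], bool],
    batch: int = 100,
    shrink: int = 10,
) -> Optional[str]:
    """Return the first upstream commit whose merge gives a bad result.

    commits: upstream commits not yet merged downstream, oldest first.
    merge_is_good: hypothetical callback.  Given every upstream commit
        merged so far, it reports whether the result is still good
        (for example: it builds and the tests pass).
    Returns None if every commit merges with a good result.
    """
    merged = 0  # number of leading commits known to merge with a good result
    while merged < len(commits):
        end = min(merged + batch, len(commits))
        if merge_is_good(commits[:end]):
            # Keep this merge and carry on with the same batch size.
            merged = end
        elif end - merged == 1:
            # A single commit turned a good result into a bad one.
            return commits[merged]
        else:
            # The 'git reset --hard' step: discard the attempt and retry
            # from the last good point with a smaller batch.
            batch = max(1, batch // shrink)
    return None


if __name__ == "__main__":
    upstream = [f"c{i}" for i in range(250)]
    # Pretend that merging c137 is what breaks the downstream branch.
    print(find_first_bad_commit(upstream, lambda done: "c137" not in done))
```

In real use the callback would create the throwaway merge and run the
checks, and the batch sizes are just the ones from the quoted message;
nothing here depends on them being 100 and 10.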