[llvm-dev] [RFC] One or many git repositories?

Mehdi Amini via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 29 10:01:34 PDT 2016

> On Jul 29, 2016, at 2:19 AM, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote:
> On 29 Jul 2016, at 05:11, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> What I meant by “different problem" is that “downstream users” for instance don’t need to commit, that makes their problem/workflow quite different from an upstream developer (for instance it is fairly easy to maintain a read-only view of the existing individual git repo currently on llvm.org).
> I’m not convinced by this distinction.  A lot of downstream developers need to patch LLVM and we benefit when they upstream their changes.  

I was distinguishing between downstream users and developers, i.e. someone who just needs to get and build compiler-rt vs. someone who wants to *commit* to LLVM. Note that even if you only get a single repo, you can still send a patch to the mailing list and someone can commit it for you (with correct author attribution, contrary to SVN).

> We should not make it harder for them to do this.  To give a couple of example downstream projects, both FreeBSD and Swift have patches on LLVM / Clang in their versions that they gradually filter upstream.  Both projects have LLVM committers among their members.  If the workflow that we recommend for them makes upstreaming easy then they benefit (maintaining a fork is effort) and LLVM benefits (having people provide bug fixes makes our code better).
> The workflow that we want to recommend to these people is:
> - Fork the repo that you’re interested in from the LLVM GitHub organisation
> - Make your changes
> - Send pull requests for anything that you think is of interest to upstream

Note that the workflow you describe above still requires them to export their patch and import it into this clone before pushing.
(Note also that we accept patches on the mailing list, so one does not even need to clone the official repo).

> This makes the barrier to entry for sending code back upstream *much* lower than it currently is,

I don’t understand this statement. As of today you can send a diff to the mailing list; I don’t see how much lower the bar can be.

> to the benefit of all.  If the alternative is:
> - Fork a read-only repo that you’re interested in from the LLVM GitHub organisation
> - Make your changes

Why? If you know you want to *push* commits upstream, fork the only useful repo for that in the first place.

> - Fork a different repo from the LLVM GitHub organisation
> - Run a script to filter some of your changes into that one

I don’t know why you think there is a need for a script, or why it is different from today.
Let’s say I’m working on a fork of the read-only compiler-rt repo and I want to upstream a patch at some point:


- cd /path/to/compiler_rt-forked
- git format-patch …
- cd /path/to/compiler_rt-upstream
- git am  /path/to/compiler_rt-forked/0001-My-awesome-changes.patch
- git svn dcommit
- done

Tomorrow with a monorepo:

- cd /path/to/compiler_rt-forked
- git format-patch …
- cd /path/to/unifiedrepo-upstream
- git am --directory=compiler-rt /path/to/compiler_rt-forked/0001-My-awesome-changes.patch
- git push
- done
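
For what it's worth, the monorepo steps above can be tried end to end with throwaway repositories. The following is only a sketch under assumptions: the repository names, file contents, and commit identity are placeholders invented for the demo, and everything stays local (there is no `git push`, since there is no real upstream here).

```shell
#!/bin/sh
set -eu

# Placeholder identity so the demo does not depend on any global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Scratch area; every path below is hypothetical.
work=$(mktemp -d)
cd "$work"

# 1. Stand-in for the unified repo: compiler-rt lives in a subdirectory.
git init -q unifiedrepo-upstream
(
    cd unifiedrepo-upstream
    mkdir compiler-rt
    echo "original" > compiler-rt/README.txt
    git add .
    git commit -q -m "Initial import"
)

# 2. Stand-in for the standalone fork: the same file, at the repo root.
git init -q compiler_rt-forked
(
    cd compiler_rt-forked
    echo "original" > README.txt
    git add .
    git commit -q -m "Initial import"
    echo "my awesome change" >> README.txt
    git commit -q -a -m "My awesome changes"
    # Export only the most recent commit as a patch file.
    git format-patch -q -1 -o "$work/patches"
)

# 3. Apply the patch in the unified repo, prefixing its paths with compiler-rt/.
(
    cd unifiedrepo-upstream
    git am -q --directory=compiler-rt "$work"/patches/0001-*.patch
)

# Show the resulting commit subject and the patched file.
git -C unifiedrepo-upstream log -1 --format=%s
cat unifiedrepo-upstream/compiler-rt/README.txt
```

The only step that differs from the per-project flow is `--directory=compiler-rt`, which tells `git am` to prepend that directory to every path in the patch.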

Alternatively, if I upstream a patch only once a year, I don’t really need to push it myself:

- cd /path/to/compiler_rt-forked
- git format-patch …
- email the patch.

> - Send a pull request from that

Note that I think we deferred any change to the workflow to future discussions (pull requests are not part of our workflow today).

> - Deal with merging between the two yourself

I don’t know what you mean by dealing with the merging. I don’t expect any difficulties, so you’ll need to elaborate.

> I strongly suspect that we’ll get a lot fewer useful contributions from downstream.  Or downstream people will just work on the monorepo and eat the cost.
> If someone is working on a downstream LLVM project and becoming familiar with our codebase, then we want them to be subtly nudging their workflow so that they eventually become LLVM contributors without noticing!

Sure. The distinction between “downstream users” and “developers” was made in response to “there exist many users who just download and build a subproject”. These are not people who are *developing* on a downstream fork.

