[llvm-dev] [RFC] One or many git repositories?

Mon Jul 25 06:29:01 PDT 2016

> On 25 Jul 2016, at 13:34, Renato Golin <renato.golin at linaro.org> wrote:
> 
> On 25 July 2016 at 12:40, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote:
>> I’d suggest that no one should have direct push access
> 
> That's what I meant by "massive change". :)
> 
> I don't think that's a bad model, but it's a very different model than
> we currently have, so would need more discussion on that side before
> we can even think about the technical details.
> 
> 
>> - User sends pull request.
>> - User from a list of people that correspond to our current set of committers labels the request as ‘approved’ (GitHub labels can only be applied by members of the organisation).
>> The buildbot master then:
>> - Collects all of the pull requests (git fetch)
>> - Uses the GitHub API to filter the ones that have been approved
>> - Merges these, in turn, to its local master branch
>> - Sends those changes to the builders
>> - If they succeed, pushes master to GitHub
> 
> This sounds a lot like the Gerrit model.
> 
> Can buildbots do that? How much code would be needed?
> 
> Wouldn't it be easier to do this in Jenkins (push based)?

These are both good questions for someone more familiar with the Buildbot infrastructure than me.

> 
> 
>> In idle time (when not busy with approved patches), it should also try other pull requests and add a label saying that they’ve passed the tests.
> 
> This sounds like an added complexity to the buildbot code. (priority
> queues?) Not impossible, maybe not even hard, but yet another small
> thing to get right…

Not hard - there are two queues (stuff that’s ready to merge, stuff awaiting review) and you only start on the second queue when the first one is empty.

>> There are some obvious refinements to this, for example only kicking the fast buildbots when there are several patches outstanding and then running the slower ones with the combined master, then backtracking and applying them one at a time if this fails.
> 
> If this is going to work, we'll need to run a single pull request on
> top of the correct "master" for every change.
> 
> IIUC, GitHub will "trigger" a CI change for every *update* on every
> pull request, too. So we'll end up with a lot more half-baked patches
> than usually go through the current buildbots.

Yes, GitHub can do this on a push model, but the first filter would be ‘does this have the correct label’, and if not then it goes on the low-priority pile.

> 
> This will mean more builders per bot (I'm expecting at least 3) and a
> silent bots (we can use 8014 for that).
> 
> 
>> I don’t have particularly strong feelings about this, I just wanted to point out that it’s possible with GitHub and a little bit of client code talking to the GitHub APIs.  There’s a slightly more complex workflow for people to push code, but in exchange for that we’d never be in a state where anything pushed to master introduces regressions for which tests exist.
> 
> I'm a big advocate of pre-commit tests and the Gerrit model, I think
> it makes developers more aware of their own failures (which bbots do)
> without upsetting the rest of the community (which bbots don't).
> 
> For instance, in this model, I only need to review code that has been
> marked by the pre-commit bots as "good", so it also saves a lot of
> time from code reviewers, in addition to buildbot owners, and it will
> make the releases, back-porting, bisecting, *much* easier by the
> massive reduction in silly reverts.

These are the advantages.  The disadvantages are:

- Someone needs to write the Buildbot integration code

- Committers need to adopt a slightly more complex workflow (though not more complex than going via phabricator)

> If I understand correctly, the model you propose is:
> 
> 1. Admins: people with direct commit access used only for fixing the
> repo, fork releases, etc.

Yup.

> 2. Pre-review committers: people with the right to create a
> "auto-merge" tag on pull reviews, but not commit directly.

Yup.

> 3. Post-review committers: just pull request access, need to wait for
> someone in (2) to approve

Yup.

> 4. Everyone else: just pull request access, need to wait for someone
> in (2) to approve

There’s not really a difference between 3 and 4 in this model.

> 
> A few problems with this:
> 
> A. How do we divide the current (fuzzy) "groups" into those well defined groups?

Group 4 is easy, that’s everyone that we don’t actively put in a group.

I have no idea about group 1.  This set roughly corresponds today to the set of people who are Chris, and should probably be half a dozen people for a bit more redundancy (admins are also the people who can add people to the GitHub group, create new repos, and so on).  It might be something overseen by the foundation.

For groups 2/3 as they map to current committers, the same way that we do now:

Give people membership in the group and let them know what the community expects them to have reviewed pre-commit.  If they start committing things that should have been reviewed without proper review and they don’t change their behaviour, then move them back into group 4.

> B. (2) folks will have the choice to get reviews, (3) won't have the
> choice to quickly fix their commits (which some do today).
> C. The migration between (3) to (2) is not defined *at all* today, and
> this is where the most radical cut will be.

If the difference is purely cultural and not mechanically enforced, then (3) will be able to fix their commits by labelling them as approved.

I don’t see a need for a mechanical separation between groups 2 and 3.  In FreeBSD we have a concept of mentorship, where new committers must have ‘Approved by: {name} (mentor)’ in all of their commits and their mentor is responsible for ensuring that they are following the expectations of the community.  This is similarly not enforced by the revision control system (though, again, people who don’t do it may find that they can no longer commit).

> Post-review commit access is, as I understand, a social recognition
> for those striving to become more active committers in the future. By
> flattening (3) and (4), any threshold we pick will be "wrong" in some
> form, and may discourage people from contributing changes, or striving
> to become pre-review committers.

I’m not proposing flattening 3 and 4.

> I'm not saying you're wrong, or that I don't agree with you, but this
> won't be an easy sell…

It’s also worth pointing out that this is in no way required to happen at the same time as the GitHub migration (or at all), so should not delay that.  Just something for people to think about...

David