[llvm-dev] Allowing PRs on GitHub for some subprojects

Mon Mar 9 12:20:40 PDT 2020

Hello all,

I'd like to share my thoughts on this topic. Most of my experience with
LLVM is through libc++ so, that's what I'll talk about but, I assume that
most of this applies to other sub-projects as well.

There are four things that I think everyone would agree are important to
allow us to succeed in collaborating on a shared codebase. These things
are:

   1. A good reviewing tool.
   2. Issues tracking.
   3. A good testing framework.
   4. The ability for the maximum number of people to contribute
   high-quality patches.

It's hard to compare the review tool. Lots of people have strong opinions
about it, and it's mostly subjective.

We have already moved some sub-projects to GitHub issues. This means we
have four different places to look for the "status" of something. We have
the status page, GitHub issues, Bugzilla, and Phabricator/GitHub. To check
if something has been implemented, a person has to look through an
undefinable combination of those sites. Imagine this instead: the status of
issues in the current implementation, LWG issues, and papers would all be
tracked on GitHub. Those issues could be linked to a patch made on GitHub,
and both the issue and the patch could be assigned to someone. Currently,
to check the status of something, you have to look through all four of
those platforms above and maybe ask on the mailing list to check if
someone's working on it already. If we use GitHub, we eliminate both the
difficulty of checking and the semi-frequent occurrence of two people
implementing the same thing.

Libc++ has a great set of unit tests but, they have lots of different
(usually inline) configurations. Often (especially for newer contributors),
all the tests are run in several configurations, and then the patch is
committed. Later the buildbots find something, and the patch has to be
reverted. This is unnecessary. If the buildbots were run *before* the patch
lands, we could catch the majority of the errors early.

When a patch is submitted to libc++ by anyone except the three maintainers,
a few things happen. First, it's reviewed. That process works well and
shouldn't change too much. Second, it often needs to be tested by one of
the maintainers. While I'm not a maintainer of libc++, I can imagine that
this process can be time-consuming, especially considering the volume of
patches submitted. Last, the change needs to be committed. This involves
downloading the raw patch, applying it, and committing on behalf of the
person. Then, watching the bots and sometimes reverting the patch. This is
also time-consuming. On GitHub, we can configure the PRs with requirements.
These requirements can be things like, "x number of maintainers approve
this patch." Or, "the following buildbots succeed." These requirements mean
that the reviewers are no longer responsible for committing, running the
tests, or reverting. GitHub will make sure the tests pass before anything
is committed, then after it's been approved, the contributor can commit and
revert the code themself (without commit access).

TL;DR: There are only three maintainers and their time is very valuable.
Currently, we waste lots of it by having them manually go through a series
of steps that are error-prone and time-consuming, but easily automatable.
We should automate them such that they have a higher success rate and don't
waste valuable time.

Best,

Zoe

On Wed, Mar 4, 2020 at 8:14 AM Louis Dionne via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Mehdi, Chris & others,
>
> I guess I did not express the main reasons for wanting to switch over very
> well in my original message. Like Christian talked about, for me it's all
> about pre-commit testing. I believe pre-commit testing is a widely shared
> desire among this community. However, how badly it is missed depends on
> sub-projects, because they have different realities. For example, in libc++:
>
> 1. We have a lot of first-time contributors, which means that the
> maintainers end up shepherding many contributions. In particular, this
> often means fixing small breakage following their changes, which can be
> difficult for them because they can't reproduce the failures locally, and
> they might not even know where to look. While these contributors can submit
> valuable improvements and bug fixes, we can't expect them to fix every last
> platform that we support in the current state of things -- it's hard, it's
> boring, and it's stressful.
>
> 2. Our testing matrix is very large, and interactions between different
> configurations (usually #ifs/#elses) is very subtle. This means the rate of
> mistake-on-first-try is, I think, higher in libc++ than in most other LLVM
> projects. Even with careful review, I find that a large percentage of
> changes end up breaking something somewhere, and I have to fix it (usually
> quickly enough to avoid reverting).
>
> As a result, the lack of pre-commit testing is actively harming the health
> of libc++ as a project. It might be true for other projects as well, but I
> can only speak for libc++ because that's where I have first hand
> experience. Unfortunately, we currently don't have a good way of doing
> pre-commit testing on Phabricator AFAICT. From the Harbormaster
> documentation [1]:
>
>     You'll need to write a nontrivial amount of code to get this working
> today. In the future, Harbormaster will become more powerful and have more
> builtin support for interacting with build systems.
>
> So while I appreciate all the efforts being made in this area, I still
> don't even know where to start if I want to setup pre-commit testing for
> libc++ today. However, the path is very clear with GitHub PRs and there are
> many options available.
>
> Whenever I hear arguments of dividing the community, not being able to
> share infrastructure, the lack of Herald -- those all make a lot of sense
> to me and I think they're good arguments. However, it is clear that folks
> who even think about these arguments are not paying the same cost for the
> lack of good pre-commit testing that I'm paying on a weekly basis, because
> for me that outweighs everything else.
>
> I don't know how to come to a decision here, all I know is that libc++
> needs to get out of the status quo soon. And if the solution is that
> Harbormaster suddenly becomes usable without an unreasonable time
> investment from me, then I'm fine with that too. I'm not looking to switch
> to GitHub PRs for the sake of it, I'm looking to solve problems that are
> harming libc++ in the current system.
>
> Cheers,
> Louis
>
> [1]: https://secure.phabricator.com/book/phabricator/article/harbormaster/
>
>
> On Feb 29, 2020, at 23:06, Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>
>
> On Tue, Feb 25, 2020 at 4:19 AM Christian Kühnel via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi Louis,
>>
>> I think this is a good idea. We should start with some local experiments
>> where people are willing to try it and figure out how well that works and
>> what does not. Why not allow this for "not significant" changes? They are
>> merged without review today, so we could do them with reviews (and
>> automated tests) via pull requests instead.
>>
>
> I still feel this is only a recipe for confusion if "some" pull-requests
> are accepted on Github but not all. So -1 from me on this.
>
>
>>
>> @Mehdi
>>
>>> - it does not favor to build common tooling: the recent work on enabling
>>> pre-submit CI tests on Phabricator is valuable and I'm looking forward to
>>> get this extended. But splitting the various ways of contributing to the
>>> repo just means more infrastructure to build to sustain this kind of
>>> efforts. (the infrastructure is easier built on GitHub by the way, but
>>> that is an argument in favor of migrating from Phab to GH for the
>>> full-project).
>>>
>>
>> Oh I'm happy to add Github support as soon as someone switches on PRs.
>> This is soooooo much easier to set up and maintain than the Phabricator
>> integration. And we already have builds for the release branch (
>> https://buildkite.com/llvm-project/llvm-release-builds) anyway. So we
>> could easily scale that up. And we can only get pre-merge testing on
>> Phabricator to a certain point, as it's not triggering builds for ~50% of
>> the code reviews.
>>
>> @Chris Lattner
>>
>>> Although I am one of the (many) people who would love to see us move
>>> from Phabricator to GitHub PRs, I think it is super important that we
>>> do the transition all at once to keep the LLVM community together.  I’m
>>> already concerned about the fragmentation the discourse server is causing,
>>> e.g. MLIR not using a -dev list.  I’d rather the community processes stay
>>> consistent.
>>>
>>
>> Please allow me to disagree there. IMHO we're way too large and diverse
>> of a project to do binary, overnight transitions.
>>
>
> You seem to be arguing the "how to transition" while there is no agreement
> on a transition happening in the first place.
>
>
>> We're also too large to follow a one-size-fits-all approach. If we agree,
>>
>
> I don't: we went with a monorepo because we believed that the
> one-size-fits-all would be more beneficial than splitting, both in terms of
> infrastructure, but also in terms of the practices of the community, etc.
>
> Github PRs are the right glow, why take this step-by-step. We should have
>> something like a list of important and supported use cases/interactions for
>> the infrastructure. Then we could start working on them one-by-one and
>> figure out if/how they could be implemented on Github and how we could do a
>> smooth transition between these.
>>
>> If Herald rules are important: Find a way to implement something similar
>> for Github. Maybe there is even a market for such a tool.
>> If transparency is the problem: Find a way to mirror PRs into
>> Phabricator, so people can at least see them there.
>> We're not restricted to community contributions there. We can also pay
>> someone to build the things we need.
>>
>
> One aspect here though is that we can pay someone to build the things we
> need in Phabricator, we can't change GitHub though.
> It was mentioned in the past that we should engage with GitHub and see if
> they would add the feature we're missing to their roadmap, if it hasn't
> been done I'd start there: building up this list of things that need to
> happens before we can agree towards a transition, and engaging with GitHub
> to have these.
>
> --
> Mehdi
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200309/55443c0d/attachment-0001.html>