[llvm-dev] Enable Contributions Through Pull-request For LLVM
Mehdi AMINI via llvm-dev
llvm-dev at lists.llvm.org
Fri Nov 8 00:21:21 PST 2019
On Thu, Nov 7, 2019 at 1:16 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:
> On 11/7/19 1:50 PM, Mehdi AMINI wrote:
>
>
>
> On Thu, Nov 7, 2019 at 9:32 AM Finkel, Hal J. via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>> On 11/7/19 10:13 AM, Jameson Nash wrote:
>>
>> I don't intend to weigh in on either side, but just give a perspective on
>> a few questions asked.
>>
>> From an outsiders perspective, that list isn't what I'd typically
>> describe as the workflow, since it makes "fork" and "branch" sound like
>> difficult operations. This sounds akin to thinking that someone would
>> reclone the svn repo before working on a new commit—possible, but rather
>> extreme. The distributed nature of git means that the fork is pretty simple
>> for github (essentially creates a new branch in their database, but that is
>> just defining a new name for an existing commit).
>>
>>
>> FWIW, I've not found it to be so straightforward in practice. If you have
>> multiple LLVM forks, representing different distinct projects, because
>> GitHub only lets you have one fork per account, the process via which you
>> can work with multiple derived projects is actually quite annoying.
>>
>
> I didn't quite get why you need multiple forks for a given canonical
> repository? Maybe you're trying to setup a non-LLVM sub-project being
> developed in a fork of the LLVM repo?
>
>
> Yes. If I have one project on optimizations of accelerator constructs of
> LLVM, and I have another project on adding quantum-computing extensions to
> LLVM, and a third project on an LLVM-based high-level-synthesis tool, then
> I want three separate forks. When someone getting these repositories runs
> 'git branch -a', they should only see branches related to the project on
> which they're working, and each repository should contain only data files
> and history related to that project (which might be large). In addition,
> each of these projects needs to have its own wiki pages, track its own
> bugs, etc. To put it another way, when someone grabs some fork on LLVM on
> which I'm working, it's not "Hal's LLVM that happens to have a branch that
> does X", but rather, "A named fork of LLVM for a particular project".
> Moreover, any of these projects might have changes that we want to pull
> into a branch for upstream contribution.
>
Right that makes sense: it think for such purpose I wouldn’t use a personal
fork of the canonical repo, I would likely just create a new repo for the
purpose of this project (GitHub would not track this as a « fork »
directly, but that shouldn’t matter that much for developing such a project
would it?).
Actually using llvm-commits@ or Phab does not make it easier does it?
Best,
—
Mehdi
> -Hal
>
>
>
> Thanks,
>
> --
> Mehdi
>
>
>
>> GitHub has a page on this (
>> https://github.community/t5/Support-Protips/Alternatives-to-forking-into-the-same-account/ba-p/7428),
>> but all of the work-arounds are pretty bad.
>>
>> In short, only the first fork is easy. After that, it gets harder.
>>
>> -Hal
>>
>>
>> I've noticed GitHub recently made it easier to delete the fork, but I'd
>> argue most users shouldn't care. What you'd typically do instead is to do
>> `git remote add <name> <url>` and/or use their `hub` CLI wrapper (
>> https://hub.github.com) to synchronize work between multiple remote and
>> local locations.
>>
>> > On the note of branches for PRs, don't they require users to push their
>> local branches to the remote repo to create? That means we'll end up
>> thousands of branches in git.
>>
>> Yes, but a branch is just a name for a particular list of commits. Unlike
>> svn, it is not the commits themselves, so they have almost no performance
>> implications. I don't think LLVM would want everyone pushing to their work
>> in progress to the central repo, so the "thousands of branches" would be
>> scattered across each individual contributor's repo, and would represent an
>> likely inconsequential fraction of the number of branches already on GitHub.
>>
>> > Impossible to chain reviews - a PR diff can only be made on top of git
>> master branch
>> It is possible to make a PR against someone else's PR on their fork,
>> although I hesitate to compare this to the far more usable phabricator
>> capability, so perhaps "impossible" is the right term anyways.
>> By contrast though, GitHub has no intrinsic concept of a "master" branch.
>> Even aside from the question of using GitHub PRs, you can ask GitHub to
>> make a diff between any two (or three) points in time in a few ways, such
>> as you could also do locally with git. For example, this'll show you the
>> last two commits:
>> https://github.com/llvm/llvm-project/compare/HEAD^^...master
>> <https://github.com/llvm/llvm-project/compare/HEAD%5E...master> (two
>> dots for a direct diff, three for showing only changes since the branch
>> point). There's also some convenient additional tooling such as adding
>> ".patch" will download it as a raw patch file instead showing in the rich
>> editor.
>>
>> > There is no way to see previous version of the patch.
>> This is a fairly new feature, but the PR itself now lists these as events
>> (e.g. "user force-pushed from abcdefg to gfedcba", where each of those are
>> hyperlinks) in the comment feed.
>>
>> > how do i go back to previous revision
>> Answering a slightly different question in the hopes someone finds this
>> information useful: locally you can do `git reflog <branchname>` to get a
>> history of the state of the HEAD of any tag on that particular machine for
>> the last 90 days.
>>
>>
>> On Thu, Nov 7, 2019 at 8:10 AM Nemanja Ivanovic via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> I find Hal's analysis of this subject to be the most pragmatic one in
>>> the thread and really want to chime in with a big +1 for the balanced
>>> approach.
>>>
>>> TL; DR;
>>> As it was outlined by previous comments in the thread, the GitHub PR
>>> review tool is inferior to Phabricator (and I would argue that this is an
>>> objective statement). Those that are only familiar with GitHub may not
>>> ascribe much value to capabilities such as commenting on unchanged lines,
>>> expanding whatever context you're interested in, seeing the diff of two
>>> revisions, etc. But for a good portion of the LLVM community, these things
>>> are important.
>>>
>>> I think a well integrated solution would not place undue burden on the
>>> author or the reviewer. Allowing pull requests seems to be desired from the
>>> perspective of making it easy for authors. Allowing the review to happen on
>>> Phabricator seems to be desired from the perspective of making it easy for
>>> the reviewers. So a solution that can do both is the best of both worlds.
>>>
>>> One comment on the initial proposal is that it seems like what is
>>> proposed may take away the supposed benefit of moving to PR's in the first
>>> place - making it easy for the author. I fail to see how the workflow*:
>>> - Fork
>>> - Clone
>>> - Create branch and modify code
>>> - Push
>>> - Create PR
>>> - Squash+merge after approval
>>> - Delete the fork
>>>
>>> Is easier from the current:
>>> - Clone
>>> - Create branch and modify code
>>> - git diff -U99999
>>> - Post through web UI (or use arcanist for these two steps)
>>> - Apply the approved version and git push
>>>
>>> * There is a distinct possibility that I am misunderstanding how the new
>>> workflow would work and it would be easier than what I outlined above, but
>>> that's just the way I understand it.
>>>
>>> On Thu, Nov 7, 2019 at 6:42 AM Finkel, Hal J. via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> I think that it's really important that we try to strike some balance
>>>> here. Based on my experience, this thread, and offline conversations, two
>>>> things seem clear to me:
>>>>
>>>> 1. Overall, Phabricator is a superior tool for managing code reviews
>>>> and some related processes (although GitHub's tools certainly have some
>>>> benefits, and both are getting better over time).
>>>>
>>>> 2. Not accepting GitHub PRs forms a barrier to entry for casual
>>>> contributors, and perhaps, causes workflow-integration inefficiencies for
>>>> some others.
>>>>
>>>> I'm also deeply concerned about having another place to search for
>>>> historical data, track conversations (including the ability to cross-link),
>>>> and so on.
>>>>
>>>> I think that we should first explore the solution that Facebook uses or
>>>> used, see: https://github.com/facebook/hhvm/wiki/What-is-Phabricator
>>>>
>>>> They had an import system that, when a PR was submitted, would import
>>>> it into Phabriactor and post a reply providing an explanation and a link to
>>>> the imported differential. This might be the right balance between ease of
>>>> use for casual contributors and the needs of the more-regular contributors
>>>> providing the code review (and, likely, those actually committing changes).
>>>>
>>>> Other projects have worked on various kinds of integration schemes
>>>> (e.g., https://github.com/framawiki/Github-notif-bot) that we should
>>>> investigate. It seems that the Phabricator developers are also working on
>>>> some more-general API which better supports this kind of use case (e.g.,
>>>> https://secure.phabricator.com/T12739), although the status seems
>>>> unclear to me.
>>>>
>>>> -Hal
>>>> On 11/7/19 4:30 AM, James Henderson via llvm-dev wrote:
>>>>
>>>> Having been using Github internally for code reviews of private patches
>>>> on LLVM, and Phabricator for upstream ones, I've found the latter to be far
>>>> easier to use. Prior to working with LLVM, I had basically no experience
>>>> with either, so I'd say I'm coming from a fairly neutral starting point.
>>>> Here are some of my observations (I want to highlight 9 as a particular
>>>> issue, due to other recent discussions I've seen on the mailing list):
>>>>
>>>> 1) I don't know of a natural way to chain related patches together on
>>>> Github (aside from explicitly mentioning them), but as separate reviews.
>>>> This is useful for bigger features/refactorings where each individual step
>>>> provides some benefit, but seeing the bigger picture is easy. Phabricator
>>>> has the child/parent objects option.
>>>> 2) Phabricator allows placing comments anywhere in the diff context,
>>>> which is useful when the commit affects lines that haven't actually been
>>>> changed, and those lines need addressing/referencing in some way. Github
>>>> doesn't allow comments outside the immediate context around changed files
>>>> 3) As a +1 to Github on the other hand, commits always come with full
>>>> context, so you don't have to remind people to include it.
>>>> 4) Phabricator's ability to see what has changed since the previous
>>>> time you commented seems to be much more reliable than what Github provides.
>>>> 5) With a Github PR, if you want to include a minor change to the patch
>>>> prior to committing, you have to commit it to the PR, if you wish to use
>>>> the UI to do the merging. Of course, you could just not push the patch via
>>>> Github.
>>>> 6) I myself have on numerous occasions messed up my commit message when
>>>> committing via the Github UI, because it's not obvious unless you are a
>>>> seasoned user that the title of the commit appears as the first line of the
>>>> commit message. This is important when doing a squash and merge.
>>>> 7) The PR approach does at least allow committing via the UI, which is
>>>> perhaps a little less fiddly in some cases.
>>>> 8) I rarely bother creating a branch for Phabricator reviews, because I
>>>> don't need it (I'm often not working on multiple things at once), so the
>>>> extra hassle of creating a branch/checking it out etc that Github requires
>>>> is annoying. On the other hand, Github PRs are generally quite easy to
>>>> create once you have pushed a branch.
>>>> 9) On the note of branches for PRs, don't they require users to push
>>>> their local branches to the remote repo to create? That means we'll end up
>>>> thousands of branches in git. Not sure that this will do performance any
>>>> good, and I seem to remember there was general agreement that we didn't
>>>> want people to push their branches generally. Yes, in theory branches
>>>> should be deleted after they're merged, but I've seen that locally not
>>>> happen regularly, and that's even assuming that all PRs get merged in (they
>>>> won't).
>>>> 10) Expanding context on Github is a pain: there is no option to just
>>>> expand the whole context in a block, which means that if you need to see
>>>> something much earlier in a large file, you have to click a LOT. Also, the
>>>> browser view often doesn't then go where you expect it to from my
>>>> experience. Phabricator has a "expand all N lines" option.
>>>> 11) Small one this one, but missing new lines at end of file are much
>>>> more obvious on Phabricator than Github.
>>>> 12) Not sure if this is a real issue, but Github reviewers are limited
>>>> in number (I think it's 15?). To my knowledge, there is no such limit with
>>>> Phabricator (but then how often do you end up with 15 people marked as
>>>> reviewers?).
>>>>
>>>> I'm sure I could come up with other points for/against Github PRs. On
>>>> balance I definitely prefer Phabricator.
>>>>
>>>> James
>>>>
>>>> On Thu, 7 Nov 2019 at 09:53, Aaron Ballman via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> On Thu, Nov 7, 2019 at 3:09 AM Roman Lebedev via llvm-dev
>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>> >
>>>>> > Strong -1 personally.
>>>>>
>>>>> Likewise, for many of the same reasons detailed below.
>>>>>
>>>>> ~Aaron
>>>>>
>>>>> > * What is the endgoal? To fully kill phab and move to github
>>>>> pullrequests?
>>>>> > it might be best to discuss *that* first. (did i miss an RFC?)
>>>>> > * Separation of attention - does everyone who cares
>>>>> > now has to also look at pull requests for reviews;
>>>>> > or should they be exempt from general review attention?
>>>>> > * For me personally after using phabricator, github now seems
>>>>> > extremely crude, laggy, limited. To name a few:
>>>>> > * There is no way to see previous version of the patch.
>>>>> > I don't think there is any way to disable force-push for PR's.
>>>>> > While this is only 'slightly' limiting for the reviewer,
>>>>> > this can be more limiting for the author - how do i go back
>>>>> > to previous revision? I can't, i need to maintain a copy
>>>>> > of every branch i pushed manually.
>>>>> > * Impossible to chain reviews - a PR diff can only be made
>>>>> > on top of git master branch. Any non-trivial change consists of
>>>>> > separable PR's. Now either one will only be able to submit
>>>>> > dependent PR after the prereqs are merged, or the diff will be
>>>>> > impossible to read.
>>>>> > * Does not load large diffs by default.
>>>>> > That caught me by surprise several times
>>>>> > when i was searching for something.
>>>>> > * No diffs in mail - *super* inconvenient.
>>>>> > One has to open each pr in browser (or fetch via git etc etc)
>>>>> > * Github is an official US-based commercial organisation.
>>>>> > It *has* to follow U.S. export law. In particular i'm thinking of
>>>>> >
>>>>> https://techcrunch.com/2019/07/29/github-ban-sanctioned-countries/
>>>>> > https://github.com/tkashkin/GameHub/issues/289
>>>>> > Does phabricator already have such restrictions, blocks?
>>>>> > If not, wouldn't you say adding such restrictions is not being
>>>>> > more open for contributors?
>>>>> > What happens when, in not so long in the future, the entirety
>>>>> of, say,
>>>>> > china or russian federation is blocked as such?
>>>>> > * Same question can be asked about internet "iron" curtains
>>>>> > certain (*cough*) countries are raising. That also has already
>>>>> happened
>>>>> > before (and *will* happen again), and i was personally affected:
>>>>> > https://en.wikipedia.org/wiki/Censorship_of_GitHub#Russia
>>>>> > I don't recall that happening to phabricator yet.
>>>>> > I fail to see how that is more contributor-friendly.
>>>>> > * Not sure anyone cares, but while using github as main git
>>>>> > repository "mirror" is perfectly fine - git is distributed, only
>>>>> canonical
>>>>> > write-repo would be affected anything bad happen. But that isn't
>>>>> the case
>>>>> > for reviews, issues; as it has been discussed in the "let's
>>>>> migrate bugzilla
>>>>> > to github issues", it is far more involved.
>>>>> > * What about DMCA? Not sure how this is currently handled.
>>>>> > * UI feels laggy. Not much to add here, pretty subjective.
>>>>> > * I'm sure i'm missing a few.
>>>>> >
>>>>> > The github does come with it's benefits, sure:
>>>>> > * It is *simpler* to preserve git commit author.
>>>>> > Currently one has to ask the author for the "Author: e at ma.il"
>>>>> line,
>>>>> > and do `git commit --amend --author="<>"`.
>>>>> > * @mention is wide-r-reaching - whole github, not just llvm
>>>>> phabricator
>>>>> > * No more "phabricator disk full" issues
>>>>> > * ???
>>>>> >
>>>>> > TLDR: such migration lowers the bar for new, first time,
>>>>> > unestabilished contributors, but i personally feel it *significantly*
>>>>> > raises the bar for the existing contributors, reviewers.
>>>>> > We don't live in perfect world. Aspirational goals are aspirational.
>>>>> > They should be attempted to be reached, but they shouldn't shadow and
>>>>> > overweight, take over the main goal of the LLVM project.
>>>>> >
>>>>> > Personally, i don't see that benefits out-/over- weight the
>>>>> drawbacks.
>>>>> >
>>>>> > Roman.
>>>>> >
>>>>> > On Thu, Nov 7, 2019 at 8:32 AM Mehdi AMINI via llvm-dev
>>>>> > <llvm-dev at lists.llvm.org> wrote:
>>>>> > >
>>>>> > > Hi all,
>>>>> > >
>>>>> > > Now that we're on GitHub, we can discuss about pull-requests.
>>>>> > > I'd like to propose to enable pull-request on GitHub, as a first
>>>>> step as an experimental channel alongside the existing methods for
>>>>> contributing to LLVM.
>>>>> > > This would allow to find workflow issues and address them, and
>>>>> also LLVM contributors in general to start getting familiar with
>>>>> pull-requests without committing to switching to pull-requests immediately.
>>>>> The community should evaluate after a few months what would the next steps
>>>>> be.
>>>>> > >
>>>>> > > GitHub pull-requests is the natural way to contribute to project
>>>>> hosted on GitHub: this feature is so core to GitHub that there is no option
>>>>> to disable it!
>>>>> > >
>>>>> > > The current proposal is to allow to integrate contributions to the
>>>>> LLVM project directly from pull-requests. In particular the exact setup
>>>>> would be the following:
>>>>> > >
>>>>> > > - Contributors should use their own fork and push a branch in
>>>>> their fork.
>>>>> > > - Reviews can be performed on GitHub. The canonical tools are
>>>>> still the mailing-list and Phabricator: a reviewer can request the review
>>>>> to move to Phabricator.
>>>>> > > - The only option available will be to “squash and merge”. This
>>>>> mode of review matches the closest our current workflow (both phabricator
>>>>> and mailing-list): conceptually independent contributions belongs to
>>>>> separate review threads, and thus separate pull-requests.
>>>>> > > This also allow the round of reviews to not force-push the
>>>>> original branch and accumulate commits: this keeps the contextual history
>>>>> of comments and allow to visualize the incremental diff across revision of
>>>>> the pull-request.
>>>>> > > - Upon “merge” of a pull-request: history is linear and a single
>>>>> commit lands in master after review is completed.
>>>>> > >
>>>>> > > As an alternative staging proposal: we could enable pull-requests
>>>>> only for a small subset of sub-projects in LLVM (i.e. not LLVM/clang to
>>>>> begin with for example) in the repo. In this case, we would propose to
>>>>> begin with the MLIR project (as soon as it gets integrated in the
>>>>> monorepo). This would be a good candidate to be the guinea pig for this
>>>>> process since it does not yet have a wide established community of
>>>>> contributors, and the current contributors are already exclusively using
>>>>> pull-requests.
>>>>> > >
>>>>> > > Here is a more complete doc on the topic:
>>>>> https://docs.google.com/document/d/1DSHQrfydSjoqU9zEnj3rIcds6YN59Jxc37MdiggOyaI
>>>>> > >
>>>>> > > Cheers,
>>>>> > >
>>>>> > > --
>>>>> > > Mehdi
>>>>> > >
>>>>> > > _______________________________________________
>>>>> > > LLVM Developers mailing list
>>>>> > > llvm-dev at lists.llvm.org
>>>>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>> > _______________________________________________
>>>>> > LLVM Developers mailing list
>>>>> > llvm-dev at lists.llvm.org
>>>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing listllvm-dev at lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>> --
>>>> Hal Finkel
>>>> Lead, Compiler Technology and Programming Languages
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191108/93f056a0/attachment.html>
More information about the llvm-dev
mailing list