[llvm-dev] [cfe-dev] [Github] RFC: linear history vs merge commits

Fri Feb 1 14:48:32 PST 2019

On Fri, Feb 01, 2019 at 12:41:05PM -0500, Arthur O'Dwyer via cfe-dev wrote:
> I know GitHub can be configured to reject force-pushes to {any, a
> specific} branch.  I strongly suspect that if *the maintainers of
> LLVM* asked nicely, GitHub would also be able to reject
> merges to {any, a specific} branch.

Anyone with admin permissions in a repo can add "branch protection
rules" that prevent force pushes, there is no need to contact Github
support for that.

> - GitHub PRs have Travis integration so the reviewer can see if a
> particular patch is actually compileable at all, before spending time
> looking at it. I wish Phab had this feature (or maybe it exists and LLVM
> doesn't use it?).

This kind of pre-merge testing is very valuable, it could potentially
prevent some unnecessary reverts by catching issues early.

On caveat is that the test coverage would be limited or else the build
times would be very long. There are quite some builders for various
projects (llvm, cfe, compiler-rt, etc.) on a lot of different platforms
and targets (Linux, macOS, Windows, Android, etc.).

With limited resources, Arthur's suggestion seems workable to me:
- Enable pre-commit testing of few configurations (in parallel).
- Encourage developers to wait for tests to pass before pushing changes.
- Do not block hard to avoid a situation where unrelated changes
  (commits or build environment) cause failures.

I would also be in favor of linear history for reasons mentioned before
(easier bisection, more readable logs). Though whatever choice happens
(cherry-pick vs merge), I think it is important to keep the link between
a change and the review. I have seen various strategies:

- Phabricator: manually include a "Differential revision" link in the
  commit message.
- Github (merge via web interface): automatically includes a reference
  to the "Pull Request".
- Github (cherry-pick / rebase and merge with no merge commit):
  unfortunately loses the review information.
- Gerrit: developers cannot merge directly, instead they use the web
  interface to submit a change. This will add a "Reviewed-on" link that
  references the review. (Used by Wireshark, Qt, boringssl, etc.)

Projects that are use mailing lists to review patches (like Linux and
QEMU) commonly include a Message-Id tag in the commit message that
references the original mailing list discussion.

The curl project also uses Github for reviews, but encourages developers
with push permissions to manually fetch the commit, edit the commit
message to reference the review and then push to the master branch (with
no merge commits). Again, this is a manual process and new developers
who are not familiar with this process accidentally create a merge
commit anyway (or forget to reference the review).

Ideally changes go through a single gatekeeper, a tool that recognizes
comments to a merge request and subsequently tries to add extra info to
a commit message (link to a review) and enforce a linear history (by
cherry-picking changes or rebase + merge commits). This tool could be
triggered by posting comments like "@name-of-the-bot merge this".
(If necessary this could be combined with requiring tests to pass.)

Such an extra indirection with a single gatekeeper could be a quite
drastic change from the current development model though. It could
reduce the number of broken builds and improve the quality of commit
messages though.

Just my two cents, hope it helps :)
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl