[LLVMdev] svn mirror git?

Fri Nov 16 13:53:12 PST 2012

LLVM Community,

> http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-July/041738.html

This was extraordinarily valuable in learning to understand the
situation - thank you David Blaikie for pointing me to it.

A few key snippets:

"Because I optimize for the code reviewer, not the patch submitter,"
Chris Lattner

"Forcing transitioning to git makes no sense for a lot of us - for
example, we have lots of scripts that depend on svn revision numbers,"
Jason Kim

"Let me say this again: We are not fundamentally changing the
development policy around LLVM," Chris Lattner

My interpretations, which I will treat as premises for a recommended
action later in this long email:

* Chris finds code reviewers to be exceptionally rare and the
community's most valuable participants.  My previous "spork"
suggestion would be a decision made by maintainers, not influenced by
patch-contributors, and would only happen if the maintainers felt the
transition made it easier for them to review and/or commit patches.

* Dropping SVN would be expensive for some.  Instead of dropping SVN,
it is more reasonable to make git the central repo and have SVN mirror
git.

* A linear history is highly valued by Chris and many members of the community.

My input (or from my perspective, my output):

In my humble opinion, there is one big problem with git-svn and svn:
the maintainer must rebase before committing, and in git, this
changes the patch's unique ID (its commit hash).  Changing the ID creates
a serious problem, one which forces the private fork to make an early
decision about contributing back to the community.  The private fork
must decide, "do we want this patch today or would we rather wait for
it to come in through a "fast-forward" of the community's repository?"
 If we choose to accept the patch locally, we have another decision to
make, "do we want to deal with merge conflicts after the patch makes
it through the community's review process, or should we just keep it
private and enjoy easy automatic merges until the community eventually
finds the same bug and redundantly makes a similar fix?"  I hope you
see that this is not a Good Thing for the community.  The policy of
rebasing gives private forks an incentive *not* to contribute patches.
 Please oh please, do not reply saying "but that's just selfish."  The
point I am hoping to illustrate is only that this incentive exists,
and it is a consequence of policy.
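
To make the ID change concrete, here is a minimal Python sketch of how
git derives a commit ID: a SHA-1 over a short header plus the commit
body, and the body names the parent commit.  The body layout is
simplified, and the tree ID, parent IDs, author line, and message are
made-up placeholders, not real LLVM objects.  Even when the content and
message are identical, moving the patch onto a new parent gives it a
different ID.

```python
import hashlib


def commit_id(tree: str, parent: str, author: str, message: str) -> str:
    """Hash a simplified commit object: SHA-1 over 'commit <size>\\0' + body."""
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {author}\n"
        f"committer {author}\n"
        f"\n{message}\n"
    ).encode()
    header = f"commit {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical placeholder values, chosen only for illustration.
TREE = "a" * 40        # same file contents before and after the rebase
OLD_BASE = "b" * 40    # parent the private fork committed against
NEW_BASE = "c" * 40    # parent the maintainer rebased onto
AUTHOR = "Private Fork <fork@example.com> 1353100000 -0800"

original = commit_id(TREE, OLD_BASE, AUTHOR, "Fix miscompile")
rebased = commit_id(TREE, NEW_BASE, AUTHOR, "Fix miscompile")

# Same patch, different parent: the two IDs do not match.
print(original != rebased)
```

Because the private fork holds `original` and the community repository
ends up with `rebased`, git treats them as two unrelated commits, which
is where the merge conflicts described above come from.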

However, one could argue that the same policy, to always rebase,
provides incentive not to fork at all.  That is, it is easier to
contribute to the community than to make a private patch and risk
merge conflicts.  True, but there is one problem, a fact of software:
a private fork of any project exists only as a mechanism to meet
functional requirements and/or a schedule that do not align with the
official "mainline".  More concretely, if I have an upcoming
release planned and have a bug-fix that affects the correctness of the
compiler, I will most certainly add it to my private fork and not wait
on a community review.  At this point, I actually have incentive to
stop the code review process and hope the community never finds and
fixes my bug.  My life is easier when I choose not to contribute, and
this is a direct consequence of the policy decision to rebase instead of
merge.

But rebasing is fundamental in providing a linear history, right?  I
question the validity of this popular argument, and argue this is just
a tooling issue.  The very fact that a rebase can often be achieved
automatically and without conflicts should send a strong signal that
this may be true.  As it turns out, the git object tree does encode a
linear history.  But this is not obvious!  "git log" makes an awkward
design decision in ordering commits by date.  Instead, I think it
should be ordered by merge, or specifically, a pre-order, depth-first
traversal of the commit tree.  I believe people care more about when
the patch entered their own repository than when the author made the
commit to his or hers.
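
As a rough illustration of ordering by merge, here is a Python sketch
over a toy history.  It is one possible reading of the traversal
described above, not how "git log" is implemented: at each merge it
lists the commits the merge brought in, then continues down the first
parent.  The commit names, dates, and the helper names `ancestors` and
`merge_ordered_log` are all hypothetical.

```python
# Toy history: each commit maps to its parents, first parent = mainline.
# Mainline is A <- B <- C, the fork is A <- X <- Y, M merges Y into C.
COMMITS = {
    "A": [], "B": ["A"], "C": ["B"],
    "X": ["A"], "Y": ["X"],
    "M": ["C", "Y"], "D": ["M"],
}
# Hypothetical commit dates; the fork's work interleaves with mainline.
DATES = {"A": 1, "B": 2, "X": 3, "C": 4, "Y": 5, "M": 6, "D": 7}


def ancestors(commits, start):
    """Return every commit reachable from start, including start."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(commits[commit])
    return seen


def merge_ordered_log(commits, head):
    """List commits newest-first, grouped by the merge that brought them in."""
    out, emitted = [], set()

    def visit(commit, exclude):
        while commit not in emitted and commit not in exclude:
            emitted.add(commit)
            out.append(commit)
            parents = commits[commit]
            if not parents:
                break
            first, merged_in = parents[0], parents[1:]
            # Commits already on the first-parent side are not "brought in".
            already_there = ancestors(commits, first) | exclude
            for parent in merged_in:
                visit(parent, already_there)
            commit = first

    visit(head, set())
    return out


by_date = sorted(COMMITS, key=lambda c: DATES[c], reverse=True)
by_merge = merge_ordered_log(COMMITS, "D")
print(by_date)   # fork commits X and Y are split up by mainline commit C
print(by_merge)  # X and Y appear together, right after the merge M
```

For Step 1 of the proposal below, the existing "git log" options
--topo-order and --first-parent are worth checking first; they may
already give a view close to this one.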

Proposal: a slow, multistep, backward-compatible transition to remove
the disincentive to contribute patches from private forks:

Step 1: Demonstrate "git log" or a similar tool can produce a linear
history in the presence of merging.  This may already be possible.

Step 2: Swap the roles of git and svn.  Make svn the mirror and git
the central repository, and update the online documentation
accordingly.  In this step, do not change any policies, and require
anyone with commit access to maintain a linear history.  This
restriction is necessary for the svn mirror, and the swap itself gives
everyone with svn dependencies a strong hint that LLVM's use of svn is
on its way out.

Step 3: Once all svn automation dependencies have been dropped,
discontinue the svn mirror.  Relax the "always rebase" policy and ask
code-owners to start preferring merges to rebasing.

If the community is willing to make this transition, I commit to
coordinating a worldwide decentralized party celebrating our
successful move to decentralized version control.

Thank you for your time,
Greg Fitzgerald

P.S. tl;dr, right?