[flang-dev] Rewriting f18's history for inclusion in llvm monorepo (third attempt, C rewrite)
Peter Waller via flang-dev
flang-dev at lists.llvm.org
Tue Dec 17 12:52:23 PST 2019
Hi All,
A third attempt, following feedback and study.
The shell script had issues: it produced surprising trees, and
generating the Original-commit trailer was awkward. I found both easier
to work around by using the lower-level C API provided by libgit2. If
you want to see the script please take a look at the pull request:
https://github.com/flang-compiler/f18/pull/854 - I warn you, it's ugly!
The old quote, "I wanted to write a shorter program, but I didn't have
the time" comes to mind :).
Now there is a linear history, keeping the empty merge commits. The
commits rewrite the content under the flang/ directory and take the
current llvm-project master branch as the parent for (what was) the root
commit. This is something that can in principle be pushed to
llvm-project, assuming everyone here (and on llvm-dev) is happy.
=== Key links:
* Tree, merged with LLVM:
https://github.com/peterwaller-arm/f18/tree/rewritten-history-v2-llvm-project-merge
* Rewritten history:
https://github.com/peterwaller-arm/f18/commits/rewritten-history-v2-llvm-project-merge
* Rewritten history without llvm merge:
https://github.com/peterwaller-arm/f18/commits/rewritten-history-v2
* Link to the program pull request:
https://github.com/flang-compiler/f18/pull/854
=== Next steps:
* I understand that the flang community would like to push this into
upstream before the llvm-10 branch in mid-January.
* I'll email llvm-dev to solicit feedback with the intent that we would
like to do this in the near future.
* Modulo any feedback from this email or llvm-dev, I believe it's ready
to go. It just requires someone to follow the steps, run the script, and
push the resulting branch onto llvm-project.
* When we're ready to pull the trigger, I think we should:
* permanently stop accepting commits on flang-compiler/f18, and
redirect those commits to llvm-project.
* run the rewrite script
* verify the rewrite (which should be fairly easy)
* push the new history into llvm-project.
===
More detail follows for anyone interested.
=== Features:
* Commits are now prefixed with [flang-compiler/f18#PRNUMBER] to
indicate the pull request the commit was merged in, if available.
* Issue/PR references are rewritten as flang-compiler/f18#NUMBER,
according to github's convention for cross-repository references.
* Empty merge commits are now kept, so that the pull request commit
message (which usually includes the pull request title) is present in
the lineage.
* Original-commit: trailer header shows the pre-rewrite commit sha.
* Reviewed-on: trailer links to the flang-compiler/f18 pull request for
the merge commit which pulled the commit in.
* Manual rebases can be taken from branches named rebase-{12 digit merge
sha}, if they are present.
* If the remote branch llvm-project/master is available, then it also
rewrites the commits under the flang/ directory with the latest
llvm-project master as the parent of the first flang commit.
* If you want to run it yourself, it takes 3 seconds to compile and 3
seconds to run.
The program generates links and references to commit shas in
https://github.com/flang-compiler/f18 under the assumption that the
repository will continue to exist or get renamed. If it is renamed,
github's rename functionality will do the right thing, provided that the
f18 name is not reused for a different repository.
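
For anyone who wants a feel for the message rewriting without reading
the C program, here is a minimal sketch in Python. It is not the code
from the pull request. The helper name rewrite_message, the regular
expression, and the exact values written after "Original-commit:" and
"Reviewed-on:" are assumptions made for illustration; only the prefix
and the flang-compiler/f18#NUMBER reference style come from the
features listed above.

```python
import re

REPO = "flang-compiler/f18"  # repository slug named in this email

# Matches a bare "#123" that is not already qualified by a repo slug.
_BARE_REF = re.compile(r"(?<![\w/])#(\d+)\b")


def rewrite_message(message, original_sha, pr_number=None):
    """Hypothetical helper: return the rewritten commit message."""
    # Qualify bare issue/PR references so they resolve across repositories.
    body = _BARE_REF.sub(lambda m: f"{REPO}#{m.group(1)}", message.strip())

    # Prefix the subject line with the pull request, if one is known.
    if pr_number is not None:
        body = f"[{REPO}#{pr_number}] {body}"

    # Trailer values below are assumed formats, not taken from the program.
    trailers = [f"Original-commit: {REPO}@{original_sha}"]
    if pr_number is not None:
        trailers.append(
            f"Reviewed-on: https://github.com/{REPO}/pull/{pr_number}"
        )
    return body + "\n\n" + "\n".join(trailers)
```

Calling rewrite_message("Fix DO loops (#12)", "0123456789ab", 12) gives a
subject that starts with the [flang-compiler/f18#12] prefix and a
message that ends with the two trailers. References that already carry
the repository slug are left alone.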
=== Result:
* I've pushed a sample rewritten history to my personal github
repository. At time of writing it contains 2,721 commits.
* I believe the resulting history is suitable to be pushed onto
llvm-project.
* I've done a best-effort sanity check that there are no significant
differences introduced in the rewriting. There may be some minor
differences on branch commits (and some branch commits may not compile
anymore where they once did), but I have high confidence that the merge
commits are equivalent.
* I've done the easy manual rebases on a best-effort basis. There are
only 3 rebases left which weren't "easy". This results in some commits
which don't have the same checkout (and therefore may not compile for
example), but the script ensures that by the time of the merge commit,
there are no differences. Many on-branch commits are "the same", if no
other commits happened on the master branch during the feature branch.
* 110 commits have been dropped from history: 45 were now-empty
feature-branch merges, 27 were squashed, and 38 were discarded.
* Please note that because patches have been rebased, they aren't what
authors originally published, especially if it required a manual rebase.
Any mistake made during the rebase looks as though it is attributed to
the author. Hopefully the Original-commit provides a clear reference to
the ground-truth of what the author originally did.
=== Validation:
I've done the best I can to ensure the history is as faithful as it can
be. Please take a look for yourself and see what it looks like. I
believe with a reasonable amount of confidence that the checkouts are
the same at the merge commits, which is the key promise.
* Feature branch patch deltas: There is a shell script included in the
comment at the top of the program which enables looking at the
diff-to-patches (yes, diffs-of-diffs) from the rebase. To use, set
use_original_message = true. Mostly I see context changes, and a little
bit of fall out from the rebasing which doesn't look too concerning to me.
* The merge-commit promise: If you do `git log --format="%T %s"
--reverse --first-parent origin/master > A && git log --format="%T %s"
rewritten-history-v2 > B` and run `git diff --word-diff --no-index A B`
to compare the two, you can see that all merge commits have identical
trees, which is the key promise. You can also get a feel for how often
commits end up being the same before and after rewrite.
* I've verified that my name does not appear on any commits (including
as the committer) as a consequence of history rewriting.
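
If eyeballing the word-diff is tedious, the merge-commit check can also
be automated. The following is a rough Python sketch, not part of the
rewrite program. It assumes the two files A and B were produced with
the "%T %s" format and that both list commits in the same order (for
example, both oldest first). The function names are hypothetical.

```python
def parse_log(text):
    """Parse `git log --format="%T %s"` output into (tree, subject) pairs."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        tree, _, subject = line.partition(" ")
        pairs.append((tree, subject))
    return pairs


def missing_merge_trees(original_first_parent, rewritten):
    """Return original first-parent entries whose tree hash cannot be
    found, in order, among the rewritten commits."""
    trees = [tree for tree, _ in rewritten]
    missing = []
    pos = 0
    for tree, subject in original_first_parent:
        try:
            # Do not advance past the match: consecutive original commits
            # with the same tree (such as a now-empty merge) may map to a
            # single rewritten commit.
            pos = trees.index(tree, pos)
        except ValueError:
            missing.append((tree, subject))
    return missing
```

An empty result from missing_merge_trees(parse_log(a), parse_log(b))
means every tree on the original first-parent chain still shows up in
the rewritten history in the same relative order.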
=== Other hints:
If anyone wants to have a go at doing the remaining 3 rebases, run one
of these lines, do the rebase, and verify that at the end of the rebase
"git diff $M" is empty. Then push the branch somewhere and let me know
about it.
M=d341464e7ffd; git checkout -B rebase-${M} ${M}^2; git rebase ${M}^1
# PR #137, 6 commits, author hsuauthai
M=a24701e31301; git checkout -B rebase-${M} ${M}^2; git rebase ${M}^1
# PR #539, 13 commits, author Tim Keith
M=24856b82387a; git checkout -B rebase-${M} ${M}^2; git rebase ${M}^1
# PR #544, 8 commits, author jeanPerier
The following link shows how those merge commits appear (squashed) in
the history if the rebases are not done:
https://gist.github.com/peterwaller-arm/e7920634ccd0e0b440824663e28b4aa7.
Rebasing is a grungy thing to do, but at least we know that the
checkouts are the same at the merge commits. The only alternative I'm
aware of is to squash the second-parent history of the merge commit.
If you want to reproduce the same rewritten history as I've published
it, you'll need to add my fork as a remote, fetch my rebase branches,
and create them locally with something like `for ref in $(git
for-each-ref --format='%(refname)'
'refs/remotes/peterwaller-arm/rebase-*' | xargs -n1 basename); do git
checkout $ref; done`. If you try to reproduce and fail, let me know. The
script should be reproducible. If the rebase branches make someone
unhappy, it is easy enough to fall back to simply squashing.
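
To check which rebase branches would be picked up before creating them,
here is a small Python sketch of the name matching. It is an
illustration only: the function name is hypothetical, and it assumes
the "12 digit merge sha" in rebase-{sha} means 12 lowercase hex
characters, as in the three examples above. Feed it the output lines of
`git for-each-ref --format='%(refname)'`.

```python
import re

# refs/remotes/<remote>/rebase-<12 hex characters>
_REBASE_REF = re.compile(r"^refs/remotes/([^/]+)/(rebase-[0-9a-f]{12})$")


def rebase_branches(refnames, remote="peterwaller-arm"):
    """Map abbreviated merge sha -> local branch name for one remote."""
    found = {}
    for ref in refnames:
        match = _REBASE_REF.match(ref.strip())
        if match and match.group(1) == remote:
            branch = match.group(2)
            found[branch[len("rebase-"):]] = branch
    return found
```

Refs from other remotes, and branches that do not follow the rebase-
naming scheme, are ignored.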
Regards,
- Peter