[PATCH] D112851: [PassManager] `buildModuleOptimizationPipeline()`: schedule `LoopDeletion` pass run before vectorization passes
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 29 15:40:21 PDT 2021
lebedev.ri created this revision.
lebedev.ri added reviewers: aeubanks, asbirlea, reames, mkazantsev, fhahn, jdoerfert, nikic.
lebedev.ri added a project: LLVM.
Herald added subscribers: ormris, wenlei, steven_wu, javed.absar, hiraditya.
lebedev.ri requested review of this revision.
Test thanks to Michael Kuklinski from `#llvm`: https://godbolt.org/z/bdrah5Goo
It was originally inspired by Daniel Lemire's post: https://lemire.me/blog/2021/10/26/in-c-is-empty-faster-than-comparing-the-size-with-zero/
We manage to deduce that the answer does not require looping,
but we do that after the last `LoopDeletion` pass run,
so we end up being stuck with a dead loop.
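For illustration, the kind of source that can leave such a dead loop behind looks roughly like the sketch below. This is not the linked godbolt test; `Node`, `countNodes`, `isEmptyViaSize` and `isEmptyDirect` are hypothetical names, and the hand-rolled list only stands in for a container whose size has to be computed by walking it. Comparing the count with zero only needs to know whether the first node exists, so once that comparison has been simplified the counting loop may no longer contribute to the result.

```cpp
#include <cstddef>

// Hypothetical singly linked list node, for illustration only.
struct Node {
  Node *Next;
};

// Walks the whole list to count its nodes: a loop with no side effects.
std::size_t countNodes(const Node *Head) {
  std::size_t N = 0;
  for (const Node *P = Head; P != nullptr; P = P->Next)
    ++N;
  return N;
}

// Only the "is the count zero?" bit is used. That is equivalent to
// `Head == nullptr`, so after inlining the loop's result may become unused.
bool isEmptyViaSize(const Node *Head) { return countNodes(Head) == 0; }

// The loop-free form one would hope the function above reduces to.
bool isEmptyDirect(const Node *Head) { return Head == nullptr; }
```

Both predicates agree on every input; the only question is whether the loop in `isEmptyViaSize` is still around by the time the last `LoopDeletion` run has happened.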
Now, as with all things SCEV, this comes with a very much expected compile-time performance regression:
https://llvm-compile-time-tracker.com/compare.php?from=0ae7bf124a9bca76dd9a91b2f7379168ff13f562&to=c2ae57c9b961aeb4a28c747266949340613a6d84&stat=instructions
Looking at the transformation stats over the vanilla test-suite, I think it's rather expected:
| statistic name | baseline | proposed | Δ | % | abs(%) |
|--------------------------------------------------|----------:|----------:|------:|-------:|-------:|
| scalar-evolution.NumBruteForceTripCountsComputed | 789 | 888 | 99 | 12.55% | 12.55% |
| scalar-evolution.NumTripCountsNotComputed | 105592 | 117900 | 12308 | 11.66% | 11.66% |
| loop-delete.NumBackedgesBroken | 542 | 559 | 17 | 3.14% | 3.14% |
| regalloc.numExtends | 81 | 79 | -2 | -2.47% | 2.47% |
| indvars.NumFoldedUser | 408 | 400 | -8 | -1.96% | 1.96% |
| indvars.NumElimCmp | 3831 | 3758 | -73 | -1.91% | 1.91% |
| scalar-evolution.NumTripCountsComputed | 299759 | 304278 | 4519 | 1.51% | 1.51% |
| loop-delete.NumDeleted | 8055 | 8128 | 73 | 0.91% | 0.91% |
| machine-cse.NumCommutes | 111 | 110 | -1 | -0.90% | 0.90% |
| globaldce.NumFunctions | 1187 | 1192 | 5 | 0.42% | 0.42% |
| codegenprepare.NumSelectsExpanded | 277 | 278 | 1 | 0.36% | 0.36% |
| loop-unroll.NumRuntimeUnrolled | 13841 | 13791 | -50 | -0.36% | 0.36% |
| machinelicm.NumPostRAHoisted | 1168 | 1172 | 4 | 0.34% | 0.34% |
| phi-node-elimination.NumCriticalEdgesSplit | 83054 | 82879 | -175 | -0.21% | 0.21% |
| machine-cse.NumPREs | 3085 | 3079 | -6 | -0.19% | 0.19% |
| branch-folder.NumBranchOpts | 108122 | 107942 | -180 | -0.17% | 0.17% |
| loop-unroll.NumUnrolled | 40136 | 40067 | -69 | -0.17% | 0.17% |
| branch-folder.NumDeadBlocks | 130818 | 130607 | -211 | -0.16% | 0.16% |
| codegenprepare.NumBlocksElim | 92856 | 92714 | -142 | -0.15% | 0.15% |
| instsimplify.NumSimplified | 103263 | 103129 | -134 | -0.13% | 0.13% |
| instcombine.NumConstProp | 26070 | 26102 | 32 | 0.12% | 0.12% |
| instsimplify.NumExpand | 1716 | 1718 | 2 | 0.12% | 0.12% |
| loop-unroll.NumCompletelyUnrolled | 9236 | 9225 | -11 | -0.12% | 0.12% |
| branch-folder.NumHoist | 2773 | 2770 | -3 | -0.11% | 0.11% |
| regalloc.NumReloadsRemoved | 10822 | 10834 | 12 | 0.11% | 0.11% |
| regalloc.NumSnippets | 11394 | 11406 | 12 | 0.11% | 0.11% |
| machine-cse.NumCrossBBCSEs | 1052 | 1053 | 1 | 0.10% | 0.10% |
| machinelicm.NumCSEed | 99887 | 99784 | -103 | -0.10% | 0.10% |
| branch-folder.NumTailMerge | 72501 | 72435 | -66 | -0.09% | 0.09% |
| codegenprepare.NumExtUses | 22007 | 21987 | -20 | -0.09% | 0.09% |
| local.NumRemoved | 68232 | 68294 | 62 | 0.09% | 0.09% |
| loop-vectorize.LoopsAnalyzed | 75483 | 75413 | -70 | -0.09% | 0.09% |
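The derived columns follow from the two counters. A minimal sketch of that arithmetic, assuming % is the change relative to baseline and the last column is its absolute value (which the rows appear to be sorted by); `StatDelta` and `computeDelta` are hypothetical helper names, not part of any LLVM tooling:

```cpp
#include <cmath>
#include <cstdint>

// One row's derived columns (hypothetical helper type).
struct StatDelta {
  std::int64_t Delta; // proposed - baseline
  double Percent;     // signed change relative to baseline, in percent
  double AbsPercent;  // magnitude of Percent
};

// Assumes Baseline is nonzero.
StatDelta computeDelta(std::int64_t Baseline, std::int64_t Proposed) {
  std::int64_t Delta = Proposed - Baseline;
  double Percent =
      100.0 * static_cast<double>(Delta) / static_cast<double>(Baseline);
  return {Delta, Percent, std::fabs(Percent)};
}
```

For the `scalar-evolution.NumBruteForceTripCountsComputed` row, `computeDelta(789, 888)` gives a delta of 99 and roughly 12.55%.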
Note that I'm only changing the current (new) PM, and not touching the obsolete (legacy) PM.
This is an alternative to the function simplification pipeline variant of the same change, D112840 <https://reviews.llvm.org/D112840>.
It has both less compile time impact (since the additional number of SCEV trip count calculations
is way less than with D112840 <https://reviews.llvm.org/D112840>), and it is much more powerful/impactful (almost 2x more loops deleted).
I have checked, and doing this after loop rotation is favorable (more loops deleted).
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D112851
Files:
llvm/lib/Passes/PassBuilderPipelines.cpp
llvm/test/Other/new-pm-defaults.ll
llvm/test/Other/new-pm-thinlto-defaults.ll
llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
llvm/test/Transforms/PhaseOrdering/deletion-of-loops-that-became-side-effect-free.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D112851.383516.patch
Type: text/x-patch
Size: 8862 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211029/30890a60/attachment.bin>