[PATCH] D88471: [Passes] Run peeling as part of simple/full loop unrolling.
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 29 02:18:46 PDT 2020
fhahn created this revision.
fhahn added reviewers: efriedma, sanjoy, lebedev.ri, reames.
Herald added subscribers: dexonsmith, zzheng, hiraditya.
Herald added a reviewer: rengolin.
Herald added a project: LLVM.
fhahn requested review of this revision.
Loop peeling removes conditions from loop bodies that become invariant
after a small number of iterations. When triggered, this leads to fewer
compares and possibly PHIs in loop bodies, enabling further
optimizations. The current cost-model of loop peeling should be quite
conservative/safe, i.e. only peel if a condition in the loop becomes
known after peeling.
For example, see PR47671, where loop peeling enables vectorization by
removing a PHI the vectorizer does not understand. Granted, the
loop-vectorizer could also be taught about constant PHIs, but loop
peeling is likely to enable other optimizations as well.
This has an impact on quite a few benchmarks from
MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto, for example
Same hash: 186 (filtered out)
Remaining: 51
Metric: loop-vectorize.LoopsVectorized
Program base patch diff
test-suite...ve-susan/automotive-susan.test 8.00 9.00 12.5%
test-suite...nal/skidmarks10/skidmarks.test 35.00 31.00 -11.4%
test-suite...lications/sqlite3/sqlite3.test 41.00 43.00 4.9%
test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test 25.00 26.00 4.0%
test-suite...006/450.soplex/450.soplex.test 88.00 89.00 1.1%
test-suite...TimberWolfMC/timberwolfmc.test 120.00 119.00 -0.8%
test-suite.../CINT2006/403.gcc/403.gcc.test 215.00 216.00 0.5%
test-suite...006/447.dealII/447.dealII.test 957.00 958.00 0.1%
test-suite...ternal/HMMER/hmmcalibrate.test 75.00 75.00 0.0%
Same hash: 186 (filtered out)
Remaining: 51
Metric: loop-vectorize.LoopsAnalyzed
Program base patch diff
test-suite...ks/Prolangs-C/agrep/agrep.test 440.00 434.00 -1.4%
test-suite...nal/skidmarks10/skidmarks.test 312.00 308.00 -1.3%
test-suite...marks/7zip/7zip-benchmark.test 6399.00 6323.00 -1.2%
test-suite...lications/minisat/minisat.test 134.00 135.00 0.7%
test-suite...rks/FreeBench/pifft/pifft.test 295.00 297.00 0.7%
test-suite...TimberWolfMC/timberwolfmc.test 1879.00 1869.00 -0.5%
test-suite...pplications/treecc/treecc.test 689.00 691.00 0.3%
test-suite...T2000/300.twolf/300.twolf.test 1593.00 1597.00 0.3%
test-suite.../Benchmarks/Bullet/bullet.test 1394.00 1392.00 -0.1%
test-suite...ications/JM/ldecod/ldecod.test 1431.00 1429.00 -0.1%
test-suite...6/464.h264ref/464.h264ref.test 2229.00 2230.00 0.0%
test-suite...lications/sqlite3/sqlite3.test 2590.00 2589.00 -0.0%
test-suite...ications/JM/lencod/lencod.test 2732.00 2733.00 0.0%
test-suite...006/453.povray/453.povray.test 3395.00 3394.00 -0.0%
Note the -11% regression in number of loops vectorized for skidmarks. I
suspect this corresponds to the fact that those loops are gone now (see
the reduction in number of loops analyzed by LV).
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D88471
Files:
llvm/include/llvm/Transforms/Scalar.h
llvm/include/llvm/Transforms/Scalar/LoopUnrollPass.h
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
llvm/test/Transforms/PhaseOrdering/X86/peel-before-lv-to-enable-vectorization.ll
Index: llvm/test/Transforms/PhaseOrdering/X86/peel-before-lv-to-enable-vectorization.ll
===================================================================
--- llvm/test/Transforms/PhaseOrdering/X86/peel-before-lv-to-enable-vectorization.ll
+++ llvm/test/Transforms/PhaseOrdering/X86/peel-before-lv-to-enable-vectorization.ll
@@ -8,7 +8,9 @@
; before loop vectorization.
define i32 @test(i32* readonly %p, i32* readnone %q) {
; CHECK-LABEL: define i32 @test(
-; CHECK-NOT: vector.body
+; CHECK: vector.body:
+; CHECK: %index.next = add i64 %index, 8
+; CHECK: middle.block:
;
entry:
%cmp.not7 = icmp eq i32* %p, %q
Index: llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
===================================================================
--- llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
+++ llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
@@ -1299,7 +1299,7 @@
Pass *llvm::createSimpleLoopUnrollPass(int OptLevel, bool OnlyWhenForced,
bool ForgetAllSCEV) {
return createLoopUnrollPass(OptLevel, OnlyWhenForced, ForgetAllSCEV, -1, -1,
- 0, 0, 0, 0);
+ 0, 0, 0, 1);
}
PreservedAnalyses LoopFullUnrollPass::run(Loop &L, LoopAnalysisManager &AM,
@@ -1327,7 +1327,7 @@
OnlyWhenForced, ForgetSCEV, /*Count*/ None,
/*Threshold*/ None, /*AllowPartial*/ false,
/*Runtime*/ false, /*UpperBound*/ false,
- /*AllowPeeling*/ false,
+ /*AllowPeeling*/ true,
/*AllowProfileBasedPeeling*/ false,
/*FullUnrollMaxCount*/ None) !=
LoopUnrollResult::Unmodified;
Index: llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
===================================================================
--- llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
+++ llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -445,7 +445,7 @@
if (EnableLoopInterchange)
MPM.add(createLoopInterchangePass()); // Interchange loops
- // Unroll small loops
+ // Unroll small loops and perform peeling.
MPM.add(createSimpleLoopUnrollPass(OptLevel, DisableUnrollLoops,
ForgetAllSCEVInLoopUnroll));
addExtensionsToPM(EP_LoopOptimizerEnd, MPM);
@@ -1036,7 +1036,7 @@
if (EnableLoopInterchange)
PM.add(createLoopInterchangePass());
- // Unroll small loops
+ // Unroll small loops and perform peeling.
PM.add(createSimpleLoopUnrollPass(OptLevel, DisableUnrollLoops,
ForgetAllSCEVInLoopUnroll));
PM.add(createLoopVectorizePass(true, !LoopVectorize));
Index: llvm/include/llvm/Transforms/Scalar/LoopUnrollPass.h
===================================================================
--- llvm/include/llvm/Transforms/Scalar/LoopUnrollPass.h
+++ llvm/include/llvm/Transforms/Scalar/LoopUnrollPass.h
@@ -22,7 +22,7 @@
class Loop;
class LPMUpdater;
-/// Loop unroll pass that only does full loop unrolling.
+/// Loop unroll pass that only does full loop unrolling and peeling.
class LoopFullUnrollPass : public PassInfoMixin<LoopFullUnrollPass> {
const int OptLevel;
Index: llvm/include/llvm/Transforms/Scalar.h
===================================================================
--- llvm/include/llvm/Transforms/Scalar.h
+++ llvm/include/llvm/Transforms/Scalar.h
@@ -178,7 +178,8 @@
int Count = -1, int AllowPartial = -1,
int Runtime = -1, int UpperBound = -1,
int AllowPeeling = -1);
-// Create an unrolling pass for full unrolling that uses exact trip count only.
+// Create an unrolling pass for full unrolling that uses exact trip count only
+// and also does peeling.
Pass *createSimpleLoopUnrollPass(int OptLevel = 2, bool OnlyWhenForced = false,
bool ForgetAllSCEV = false);
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D88471.294912.patch
Type: text/x-patch
Size: 4014 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200929/711a65e2/attachment.bin>
More information about the llvm-commits
mailing list