[PATCH] D22918: [Loop Vectorizer] Support predication of div/rem
Adam Nemet via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 17 11:58:07 PDT 2016
anemet added inline comments.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4070-4096
@@ +4069,29 @@
+
+ // For each instruction I marked for predication on value C, split I into its
+ // own basic block to form an if-then construct over C.
+ // Since I may be fed by extractelement and/or be feeding an insertelement
+ // generated during scalarization we try to move such instructions into the
+ // predicated basic block as well. For the insertelement this also means that
+ // the PHI will be created for the resulting vector rather than for the
+ // scalar instruction. So for some scalarized instruction, e.g.
+ //
+ // %34 = extractelement <2 x i32> %26, i32 0
+ // %35 = extractelement <2 x i32> %wide.load, i32 0
+ // %36 = sdiv i32 %34, %35
+ // %37 = insertelement <2 x i32> undef, i32 %36, i32 0
+ //
+ // predication typically yields:
+ //
+ // %33 = icmp eq i1 %32, true
+ // br i1 %33, label %pred.sdiv.if, label %pred.sdiv.continue
+ //
+ // pred.sdiv.if: ; preds = %vector.body
+ // %34 = extractelement <2 x i32> %26, i32 0
+ // %35 = extractelement <2 x i32> %wide.load, i32 0
+ // %36 = sdiv i32 %34, %35
+ // %37 = insertelement <2 x i32> undef, i32 %36, i32 0
+ // br label %pred.sdiv.continue
+ //
+ // pred.sdiv.continue: ; preds = %pred.sdiv.if, %vector.body
+ // %38 = phi <2 x i32> [ undef, %vector.body ], [ %37, %pred.sdiv.if ]
+
----------------
I would also include the select instruction in this excerpt (omitting anything in between with a ...). Then you can explain in the initial comment that the value produced on the false branch is not used (the conditional execution is only reintroduced to avoid side-effects).
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4076
@@ +4075,3 @@
+ // the PHI will be created for the resulting vector rather than for the
+ // scalar instruction. So for some scalarized instruction, e.g.
+ //
----------------
"So for the first element of a scalarized instruction, e.g."
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4111-4118
@@ +4110,10 @@
+ continue;
+ bool CanSinkToUse = true;
+ for (User *U : OpInst->users()) {
+ if (U != &*I) {
+ // The extractelement is feeding another instruction - give up.
+ CanSinkToUse = false;
+ break;
+ }
+ }
+ if (CanSinkToUse)
----------------
OK, is this difference relevant? If yes, add a helper either in this module or at a more global place and use it.
You can probably also writes this with std::all_of or something.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4141
@@ +4140,3 @@
+ IncomingTrue = &*I;
+ IncomingFalse = UndefValue::get(I->getType());
+ }
----------------
> We are re-introducing the original scalar conditional execution of an instruction here. This undef can reach either
Makes sense but we need a comment for this. I think the best is to explain this on the excerpt at the beginning, see my comment there.
================
Comment at: test/Transforms/LoopVectorize/if-pred-non-void.ll:8-9
@@ +7,4 @@
+; instructions permit vectorization and (ii) the creation of an insertelement
+; and a Phi node. For each predicated instruction we search for the code
+; generated for the first element.
+define void @test(i32* nocapture %asd, i32* nocapture %aud,
----------------
As a demo for how this works it would be actually good to include at least one of the second element sequences as well.
================
Comment at: test/Transforms/LoopVectorize/if-pred-non-void.ll:20-21
@@ +19,4 @@
+; CHECK: vector.body:
+; CHECK: %{{.*}} = extractelement <2 x i1> %{{.*}}, i32 0
+; CHECK: br i1 %{{.*}}, label %[[CSD:[a-zA-Z0-9.]+]], label %[[ESD:[a-zA-Z0-9.]+]]
+; CHECK: [[CSD]]:
----------------
Can you please name and match this extract as well, it helps reading. Everywhere in these tests.
================
Comment at: test/Transforms/LoopVectorize/if-pred-non-void.ll:22-23
@@ +21,4 @@
+; CHECK: br i1 %{{.*}}, label %[[CSD:[a-zA-Z0-9.]+]], label %[[ESD:[a-zA-Z0-9.]+]]
+; CHECK: [[CSD]]:
+; CHECK: %[[SD0:[a-zA-Z0-9]+]] = sdiv i32 %{{.*}}, %{{.*}}
+; CHECK: %[[SD1:[a-zA-Z0-9]+]] = insertelement <2 x i32> undef, i32 %[[SD0]], i32 0
----------------
It would be also good to check the extractelements feeding the divs here.
https://reviews.llvm.org/D22918
More information about the llvm-commits
mailing list