[PATCH] D22918: [Loop Vectorizer] Support predication of div/rem

Adam Nemet via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 17 11:58:07 PDT 2016


anemet added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4070-4096
@@ +4069,29 @@
+
+  // For each instruction I marked for predication on value C, split I into its
+  // own basic block to form an if-then construct over C.
+  // Since I may be fed by extractelement and/or be feeding an insertelement
+  // generated during scalarization we try to move such instructions into the
+  // predicated basic block as well. For the insertelement this also means that
+  // the PHI will be created for the resulting vector rather than for the
+  // scalar instruction. So for some scalarized instruction, e.g.
+  //
+  //     %34 = extractelement <2 x i32> %26, i32 0
+  //     %35 = extractelement <2 x i32> %wide.load, i32 0
+  //     %36 = sdiv i32 %34, %35
+  //     %37 = insertelement <2 x i32> undef, i32 %36, i32 0
+  //
+  // predication typically yields:
+  //
+  //     %33 = icmp eq i1 %32, true
+  //     br i1 %33, label %pred.sdiv.if, label %pred.sdiv.continue
+  //
+  //   pred.sdiv.if:                                     ; preds = %vector.body
+  //     %34 = extractelement <2 x i32> %26, i32 0
+  //     %35 = extractelement <2 x i32> %wide.load, i32 0
+  //     %36 = sdiv i32 %34, %35
+  //     %37 = insertelement <2 x i32> undef, i32 %36, i32 0
+  //     br label %pred.sdiv.continue
+  //
+  //   pred.sdiv.continue:    ; preds = %pred.sdiv.if, %vector.body
+  //     %38 = phi <2 x i32> [ undef, %vector.body ], [ %37, %pred.sdiv.if ]
+
----------------
I would also include the select instruction in this excerpt (omitting anything in between with a ...).  Then you can explain in the initial comment that the value produced on the false branch is not used (the conditional execution is only reintroduced to avoid side-effects).

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4076
@@ +4075,3 @@
+  // the PHI will be created for the resulting vector rather than for the
+  // scalar instruction. So for some scalarized instruction, e.g.
+  //
----------------
"So for the first element of a scalarized instruction, e.g."

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4111-4118
@@ +4110,10 @@
+        continue;
+      bool CanSinkToUse = true;
+      for (User *U : OpInst->users()) {
+        if (U != &*I) {
+          // The extractelement is feeding another instruction - give up.
+          CanSinkToUse = false;
+          break;
+        }
+      }
+      if (CanSinkToUse)
----------------
OK, is this difference relevant?  If yes, add a helper either in this module or at a more global place and use it.

You can probably also writes this with std::all_of or something.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4141
@@ +4140,3 @@
+        IncomingTrue = &*I;
+        IncomingFalse = UndefValue::get(I->getType());
+      }
----------------
> We are re-introducing the original scalar conditional execution of an instruction here. This undef can reach either

Makes sense but we need a comment for this.  I think the best is to explain this on the excerpt at the beginning, see my comment there.

================
Comment at: test/Transforms/LoopVectorize/if-pred-non-void.ll:8-9
@@ +7,4 @@
+; instructions permit vectorization and (ii) the creation of an insertelement
+; and a Phi node. For each predicated instruction we search for the code
+; generated for the first element.
+define void @test(i32* nocapture %asd, i32* nocapture %aud,
----------------
As a demo for how this works it would be actually good to include at least one of the second element sequences as well.


================
Comment at: test/Transforms/LoopVectorize/if-pred-non-void.ll:20-21
@@ +19,4 @@
+; CHECK: vector.body:
+; CHECK:   %{{.*}} = extractelement <2 x i1> %{{.*}}, i32 0
+; CHECK:   br i1 %{{.*}}, label %[[CSD:[a-zA-Z0-9.]+]], label %[[ESD:[a-zA-Z0-9.]+]]
+; CHECK: [[CSD]]:
----------------
Can you please name and match this extract as well, it helps reading.  Everywhere in these tests.

================
Comment at: test/Transforms/LoopVectorize/if-pred-non-void.ll:22-23
@@ +21,4 @@
+; CHECK:   br i1 %{{.*}}, label %[[CSD:[a-zA-Z0-9.]+]], label %[[ESD:[a-zA-Z0-9.]+]]
+; CHECK: [[CSD]]:
+; CHECK:   %[[SD0:[a-zA-Z0-9]+]] = sdiv i32 %{{.*}}, %{{.*}}
+; CHECK:   %[[SD1:[a-zA-Z0-9]+]] = insertelement <2 x i32> undef, i32 %[[SD0]], i32 0
----------------
It would be also good to check the extractelements feeding the divs here.


https://reviews.llvm.org/D22918





More information about the llvm-commits mailing list