[PATCH] D22918: [Loop Vectorizer] Support predication of div/rem

Gil Rapaport via llvm-commits llvm-commits at lists.llvm.org
Sun Jul 31 08:19:04 PDT 2016


gilr added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3327
@@ -3321,1 +3326,3 @@
+      if (Predicate)
+        Cost += TTI.getCFInstrCost(Instruction::PHI);
       Cost += TTI.getVectorInstrCost(Instruction::InsertElement, Ty, I);
----------------
mkuper wrote:
> I'm not entirely sure what "the cost of a phi" means, especially without a type.
> 
> Also, I don't believe any in-tree target actually assigns a non-zero to PHIs right now. So if you actually want a meaningful cost here for, say, x86, you may want to look into the cost model as well...
So IIRC I took it from CostModelAnalysis::getInstructionCost(), but I'll remove it if it makes no sense.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3833
@@ -3788,3 +3832,3 @@
     I->moveBefore(T);
-    I->getParent()->setName("pred.store.if");
-    BB->setName("pred.store.continue");
+    // Try to move any extract-element we may have created for the predicated
+    // instruction into the Then block.
----------------
mkuper wrote:
> We don't run anything that will sink this later in the pipeline?
> (To be honest, even if we do, I'm not sure whether we should do the cleanup here or rely on a later pass, but I'm curious)
Actually the vectorizer currently seems to expect inst-combine to do so (see the deleted VEC-IC case in if-pred-stores.ll), but IINM the cost model didn't reflect that.
I agree, there's the general issue of generating efficient code here vs relying on later passes to clean up, which I think this was also brought up [[ https://llvm.org/bugs/show_bug.cgi?id=27881#c7 | here ]].

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3835
@@ +3834,3 @@
+    // instruction into the Then block.
+    for (unsigned In = 0; In < I->getNumOperands(); ++In) {
+      ExtractElementInst *OpInst =
----------------
mkuper wrote:
> Any reason not to use a range for over the operands?
No, will fix.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3841
@@ +3840,3 @@
+      bool CanSinkToUse = true;
+      for (User *U : OpInst->users()) {
+        if (U != &*I) {
----------------
mkuper wrote:
> Do you know if we have an isOnlyUserOf helper?
> I know we have one for SDNode... but, I couldn't find one in IR.
Me either - anyone?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3859
@@ +3858,3 @@
+      auto Users = I->users();
+      if (std::distance(Users.begin(), Users.end()) == 1 &&
+          isa<InsertElementInst>(*I->users().begin())) {
----------------
mkuper wrote:
> I->hasOneUse()? Or do you care specifically about the users() list? In any case, no need to compute std::distance.
Right, will fix.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3860
@@ +3859,3 @@
+      if (std::distance(Users.begin(), Users.end()) == 1 &&
+          isa<InsertElementInst>(*I->users().begin())) {
+        // If the predicated instruction is feeding an insert-element, move it
----------------
mkuper wrote:
> I->user_begin()?
Right, will fix.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4251
@@ +4250,3 @@
+    case Instruction::SRem:
+    case Instruction::FRem:
+      // Scalarize if block needs predication, otherwise fallthrough.
----------------
mkuper wrote:
> FRem looks out of place. Did you mean URem?
Yes.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4264
@@ -4167,5 +4263,3 @@
     case Instruction::FDiv:
     case Instruction::URem:
     case Instruction::Shl:
----------------
mkuper wrote:
> And, correspondingly, this should probably be FRem.
Indeed.


https://reviews.llvm.org/D22918





More information about the llvm-commits mailing list