[PATCH] D31175: Improve TargetTransformInfo::getCFInstrCost()

Sat Mar 25 00:50:25 PDT 2017

jonpa added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7219-7220
+    if (VF > 1 && BI->isConditional() &&
+        (PredicatedBBsAfterVectorization.count(BI->getSuccessor(0)) ||
+         PredicatedBBsAfterVectorization.count(BI->getSuccessor(1))))
+      ScalarPredicatedBB = true;
----------------
mssimpso wrote:
> Can't we just use Legal->blockNeedsPredication() for the successors here instead of creating PredicatedBBsAfterVectorization. They should be the same, right?
I don't think so since blockNeedsPredication() returns true for any block that does not dominate the Latch. This does not distinguish between blocks that will be if-converted or predicated. The block should be checked by checking each instruction in it with isScalarWithPredication(). Or am I missing something?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7223-7228
+    if (ScalarPredicatedBB) {
+      // Return cost for branches around scalarized and predicated blocks.
+      Type *Vec_i1Ty =
+          VectorType::get(IntegerType::getInt1Ty(RetTy->getContext()), VF);
+      return (TTI.getScalarizationOverhead(Vec_i1Ty, false, true) +
+              (TTI.getCFInstrCost(Instruction::Br) * VF));
----------------
mssimpso wrote:
> mssimpso wrote:
> > We should probably also include the cost of the unconditional branches inside the predicated blocks. This would amount to another factor of VF * TTI.getCFInstrCost(Instruction::Br) / getReciprocalPredBlockProb() for the ScalarPredicatedBB case.
> Regarding my previous comment, this should probably be a separate case. Something like:
> 
> ```
> if (VF > 1 && Legal->blockNeedsPredication(BI->getParent())
>   return  VF * TTI.getCFInstrCost(Instruction::Br) / getReciprocalPredBlockProb();
> ```
> We should probably also include the cost of the unconditional branches inside the predicated blocks.
I don't understand this, given that a predicated block will fall-through to its successor?

I also don't see why we should divide the *branch* cost with the reciprocal of the probability of the block being executed...? That should only be done for the predicated block cost, right? The extracts of i1's is always done.

https://reviews.llvm.org/D31175