[PATCH] D31175: Improve TargetTransformInfo::getCFInstrCost()

Matthew Simpson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 28 10:34:13 PDT 2017


mssimpso added inline comments.


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7219-7220
+    if (VF > 1 && BI->isConditional() &&
+        (PredicatedBBsAfterVectorization.count(BI->getSuccessor(0)) ||
+         PredicatedBBsAfterVectorization.count(BI->getSuccessor(1))))
+      ScalarPredicatedBB = true;
----------------
jonpa wrote:
> mssimpso wrote:
> > Can't we just use Legal->blockNeedsPredication() for the successors here instead of creating PredicatedBBsAfterVectorization. They should be the same, right?
> I don't think so since blockNeedsPredication() returns true for any block that does not dominate the Latch. This does not distinguish between blocks that will be if-converted or predicated. The block should be checked by checking each instruction in it with isScalarWithPredication(). Or am I missing something?
> 
Ah, you're right. I guess we'll need the new set then after all.


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7223-7228
+    if (ScalarPredicatedBB) {
+      // Return cost for branches around scalarized and predicated blocks.
+      Type *Vec_i1Ty =
+          VectorType::get(IntegerType::getInt1Ty(RetTy->getContext()), VF);
+      return (TTI.getScalarizationOverhead(Vec_i1Ty, false, true) +
+              (TTI.getCFInstrCost(Instruction::Br) * VF));
----------------
jonpa wrote:
> mssimpso wrote:
> > mssimpso wrote:
> > > We should probably also include the cost of the unconditional branches inside the predicated blocks. This would amount to another factor of VF * TTI.getCFInstrCost(Instruction::Br) / getReciprocalPredBlockProb() for the ScalarPredicatedBB case.
> > Regarding my previous comment, this should probably be a separate case. Something like:
> > 
> > ```
> > if (VF > 1 && Legal->blockNeedsPredication(BI->getParent())
> >   return  VF * TTI.getCFInstrCost(Instruction::Br) / getReciprocalPredBlockProb();
> > ```
> > We should probably also include the cost of the unconditional branches inside the predicated blocks.
> I don't understand this, given that a predicated block will fall-through to its successor?
> 
> I also don't see why we should divide the *branch* cost with the reciprocal of the probability of the block being executed...? That should only be done for the predicated block cost, right? The extracts of i1's is always done.
> 
> I don't understand this, given that a predicated block will fall-through to its successor?

This is probably true, but it's something I think the TTI implementation might want to worry about, not the vectorizer. The division can be fuzzy, but it's fair to ask TTI about any instruction that will exist in the IR. It's fine for TTI to return 0 (which the default implementation currently does for all branches anyway).

> I also don't see why we should divide the *branch* cost with the reciprocal of the probability of the block being executed...? That should only be done for the predicated block cost, right?

I'm not sure I completely understand your question. But I was referring to the branch inside a predicated block (a block contained in PredicatedBBsAfterVectorization). I may have been unclear. This would be a fourth case to consider, in addition to the three you already have (which look fine to me): back-edge, if-converted, conditional branch to a predicated block.

I'm saying that this branch can be treated similar to the instructions that are isScalarWithPredication regarding the cost calculation. The cost of an instruction in a block in PredicatedBBsAfterVectorization should be scaled by the block probability for VF > 1, since the block may not be executed. The VF == 1 case is handled all at once in expectedCost() where we scale the cost of the entire block (which includes the cost of the branch) by block probability. It's OK to use Legal->blockNeedsPredication() there since we're not if-converting. So this would ensure that the scalar and vector cases are computed in the same way. Does this make sense?


https://reviews.llvm.org/D31175





More information about the llvm-commits mailing list