[PATCH] D92178: [InstructionCost] Change LoopVectorizationCostModel::getInstructionCost to return InstructionCost

Wed Dec 2 06:29:19 PST 2020

david-arm added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5441
     VectorizationCostTy C = expectedCost(ElementCount::getFixed(i));
-    std::pair<unsigned, unsigned> VectorCost = {C.first, i};
+    std::pair<InstructionCost, unsigned> VectorCost = {C.first, i};
     LLVM_DEBUG(dbgs() << "LV: Vector loop of width " << i
----------------
ctetreau wrote:
> david-arm wrote:
> > ctetreau wrote:
> > > I think you can get rid of these pairs if you just check validity before assigning it.
> > OK. I guess the problem is that we could start off with an invalid cost (ScalarCost) and get another Invalid cost in the loop too. I think we could get rid of the pairs, but it just makes the main loop more complicated that's all. What I could do is add this in the main loop:
> > 
> >   if (!C.first.isValid())
> >     continue
> >   if (!ScalarCost.isValid())
> >     MinCost = C.first;
> >   // Both costs are now valid.
> >   // ... use MinCost and C somehow ...
> > 
> > The other thing I was trying to do is move from using a float to represent the MinCost in the loop, to using an InstructionCost (integer based) instead. If I get rid of the pairs then it makes more sense to revert to using a float for MinCost I think, since then division (and hence a fractional cost) is involved.
> > 
> > I can try adding control flow to the loop like above?
> So I messed around with your patch, and came up with this loop body:
> 
> ```
> for (unsigned i = 2; i <= MaxVF.getFixedValue(); i *= 2) {
>     // Notice that the vector loop needs to be executed less times, so
>     // we need to divide the cost of the vector loops by the width of
>     // the vector elements.
>     VectorizationCostTy C = expectedCost(ElementCount::getFixed(i));
> 
>     Optional<float> VectorCost =
>         C.first.getValue().map([i](InstructionCost::CostType Cost) {
>           return static_cast<float>(Cost) / i;
>         });
> 
>     // removed debug output noise
> 
>     if (auto MinCostVal = MinCost.getValue())
>       if (VectorCost && *VectorCost < *MinCostVal) {
>         Width = i;
>         // requires ctor that takes an Optional, and ctors that convert from number like things
>         MinCost = VectorCost;
>       }
>   }
> ```
> 
> I haven't ran the tests, but this compiles and I think it should work.
Hi @ctetreau, your suggestions for adding a conversion constructor/getter in general seem sensible to me, however I just have one question. In that example above it looks like you're rounding a float (VectorCost) to an integer, which is a change in behaviour I think? Currently the examples I'm thinking of are where we compare a cost of 11 for a width of 2, with a cost of 21 for a width of 4, i.e.
  float(11)/2 = 5.5
  float(21)/4 = 5.25
Previously we'd choose a width of 4 because 5.25 is less than 5.5. I think with your proposal we'd keep a width of 2 because 5.5 gets rounded down to 5. Unless we make MinCost use floats too, in which case the need for conversion goes away I think?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92178/new/

https://reviews.llvm.org/D92178