[PATCH] D59149: [LV] move useEmulatedMaskMemRefHack() functionality to TTI.

Tue Mar 19 02:02:45 PDT 2019

markus added a comment.

I guess what we want to achieve here (and please correct me if I am wrong) is

- For this first patch to have zero impact on generated code
- Allow targets to to start overriding `getEmulatedMaskMemRefCost` and return some sensible cost for the single operation being queried (and not just `3000000` or `0`)

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5251
+        if (TTI.getEmulatedMaskMemRefCost(&I, NumPredStores,
+                                          NumberOfStoresToPredicate) == 0 &&
             computePredInstDiscount(&I, ScalarCosts, VF) >= 0)
----------------
Would this still work if we allow `getEmulatedMaskMemRefCost` to return an actual cost and not just `3000000` or `0`?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5488

-    if (useEmulatedMaskMemRefHack(I))
+    if (int HackedCost = TTI.getEmulatedMaskMemRefCost(
+            I, NumPredStores, NumberOfStoresToPredicate))
----------------
rengolin wrote:
> hsaito wrote:
> > rengolin wrote:
> > > This if doesn't need the Cost above, so you can avoid the division by moving it up to the beginning of the block.
> > > 
> > > Nit: you can just return `HackedCost` instead of assigning, but that will depend on the new pattern.
> > This is actually a good question.
> > 
> > Maybe, we want to structure this TTI interface as the adjustment to the base cost computed above.
> > In which case, the code would look like
> >    Cost += TTI.getEmulatedMaskMemRefAdjustment(...)
> > 
> > Does this make more sense?
> > 
> > Even then, I still agree that we can return early like
> > if (!isPredicaredInst(I))
> >      return Cost;
> > Cost /= getReciprocalPredBlockProb();
> > return Cost + TTI.getEmulatedMaskMemRefAdjustment(...)
> > 
> > Does this sound better?
> If the prohibitive cost is large enough, then adding up will make no difference, and you can early return on either Cost or Adjusted.
> 
> If the cost isn't meant as just "big enough" but to (later) actually emulate the masking costs, then it makes sense to add like you propose on the comment above.
> 
> Right now, the case looks like the former, but if you're planning it to be more like the latter, than the new proposal makes sense.
I agree that this should be changed to
```
Cost += TTI.getEmulatedMaskMemRefAdjustment(...)
```
(or equivalent with the early return).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59149/new/

https://reviews.llvm.org/D59149