[PATCH] D59149: [LV] move useEmulatedMaskMemRefHack() functionality to TTI.
Hideki Saito via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 13 12:43:19 PDT 2019
hsaito marked 2 inline comments as done.
hsaito added a comment.
In D59149#1426070 <https://reviews.llvm.org/D59149#1426070>, @markus wrote:
> Any improvement is an improvement so I am happy with that but it is still mentioned that this solution is a hack and I guess the
>
> > Cost model for emulated masked load/store is completely broken.
>
> comment is still valid. What would it take to address this properly?
Each target to run many applications/benchmarks to come up with the "right" adjustment to the cost model. No way around that.
================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:270
+ (isa<StoreInst>(Inst) && NumPredStores > Threshold);
+ return useHackedCost ? 3000000 : 0;
+ }
----------------
rengolin wrote:
> markus wrote:
> > This value seems a bit arbitrary. Could it not be e.g. INT_MAX instead if it is supposed to represent something infinitely expensive?
> Costs will be added to others, INT_MAX would wrap.
I can certainly add a comment saying that this is an arbitrarily chosen "big enough value" --- just need to be sufficiently bigger than a typical instruction cost, which is a single digit or a low double digit, so as to disable most cases of emulated masked memory references.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5488
- if (useEmulatedMaskMemRefHack(I))
+ if (int HackedCost = TTI.getEmulatedMaskMemRefCost(
+ I, NumPredStores, NumberOfStoresToPredicate))
----------------
rengolin wrote:
> This if doesn't need the Cost above, so you can avoid the division by moving it up to the beginning of the block.
>
> Nit: you can just return `HackedCost` instead of assigning, but that will depend on the new pattern.
This is actually a good question.
Maybe, we want to structure this TTI interface as the adjustment to the base cost computed above.
In which case, the code would look like
Cost += TTI.getEmulatedMaskMemRefAdjustment(...)
Does this make more sense?
Even then, I still agree that we can return early like
if (!isPredicaredInst(I))
return Cost;
Cost /= getReciprocalPredBlockProb();
return Cost + TTI.getEmulatedMaskMemRefAdjustment(...)
Does this sound better?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D59149/new/
https://reviews.llvm.org/D59149
More information about the llvm-commits
mailing list