[PATCH] D59149: [LV] move useEmulatedMaskMemRefHack() functionality to TTI.

Tue Mar 19 10:06:30 PDT 2019

hsaito added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5488

-    if (useEmulatedMaskMemRefHack(I))
+    if (int HackedCost = TTI.getEmulatedMaskMemRefCost(
+            I, NumPredStores, NumberOfStoresToPredicate))
----------------
rengolin wrote:
> markus wrote:
> > rengolin wrote:
> > > hsaito wrote:
> > > > rengolin wrote:
> > > > > This if doesn't need the Cost above, so you can avoid the division by moving it up to the beginning of the block.
> > > > > 
> > > > > Nit: you can just return `HackedCost` instead of assigning, but that will depend on the new pattern.
> > > > This is actually a good question.
> > > > 
> > > > Maybe, we want to structure this TTI interface as the adjustment to the base cost computed above.
> > > > In which case, the code would look like
> > > >    Cost += TTI.getEmulatedMaskMemRefAdjustment(...)
> > > > 
> > > > Does this make more sense?
> > > > 
> > > > Even then, I still agree that we can return early like
> > > > if (!isPredicaredInst(I))
> > > >      return Cost;
> > > > Cost /= getReciprocalPredBlockProb();
> > > > return Cost + TTI.getEmulatedMaskMemRefAdjustment(...)
> > > > 
> > > > Does this sound better?
> > > If the prohibitive cost is large enough, then adding up will make no difference, and you can early return on either Cost or Adjusted.
> > > 
> > > If the cost isn't meant as just "big enough" but to (later) actually emulate the masking costs, then it makes sense to add like you propose on the comment above.
> > > 
> > > Right now, the case looks like the former, but if you're planning it to be more like the latter, than the new proposal makes sense.
> > I agree that this should be changed to
> > ```
> > Cost += TTI.getEmulatedMaskMemRefAdjustment(...)
> > ```
> > (or equivalent with the early return).
> IFF the adjustment has any meaning other than "too large a number", I agree, too.
The reason I changed here to this style is that the person starting to work on the real adjustment doesn't have to change the structure here. If you think this suggested "stylistic" change requires at least one target doing a "reasonable" adjustment (or committed to start such a study), I could do that for AVX2 target ---- come up with a smaller number that doesn't degrade some (undisclosed) set of benchmarks that I can run quickly enough.  Does this sound reasonable?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59149/new/

https://reviews.llvm.org/D59149