[PATCH] D109368: [LV] Don't vectorize if we can prove RT + vector cost >= scalar cost.

Sat Sep 25 11:54:51 PDT 2021

fhahn updated this revision to Diff 375060.
fhahn added a comment.
Herald added subscribers: bmahjour, dmgreen.

In D109368#2989858 <https://reviews.llvm.org/D109368#2989858>, @ebrevnov wrote:

> Hi Florian,
>
> I do think this change is very important move in the right direction. Thanks for driving it. My main concern is essentially the same as for  D75981 <https://reviews.llvm.org/D75981>. I think such things should be done directly on CostModel. I've uploaded alternative implementation which does the same thing as this patch but in the CostModel. Please take a look and let me know if you like that or not. It would be really helpful to hear from others as well.

Thanks for taking a look!

I made a largish adjustment to the patch. It still uses the formula outlined in the original patch, but uses it differently: instead of computing the costs for known/expected trip counts it now computes the minimum trip count given the costs. This new minimum trip count can than be checked against the known/expected trip count. If there is no known/expected trip count, this minimum trip count is used for the minimum iteration check.

It also computes a second minimum to have a bound on the runtime overhead (if the checks fail) compared to the scalar loop. This should guard against failing runtime checks increasing the total runtime by more than a fraction of the scalar loop. (at the moment the fraction is hardcoded 1/10th of the total scalar loop cost, but I'll add an option for it if  we converge)

In D109368#2997070 <https://reviews.llvm.org/D109368#2997070>, @ebrevnov wrote:

> I realized that doing similar thing in the cost model requires a bit of preparational work (about 6 patches). Due to that I think it may be reasonable to land this first. Please find my comments inlined. One thing I would like to ask you. In order to simplify merging with the mentioned 6 patches would be nice if you rebase your work on D109443 <https://reviews.llvm.org/D109443> (hopefully can be landed quickly) and take D109444 <https://reviews.llvm.org/D109444> (except line 10182) as part of this change.

Given the large update I did not adjust it yet on D109443 <https://reviews.llvm.org/D109443>. I'd suggest to discuss moving the code into the cost-model separately in your patches. I think at the moment the cost-model is mostly concerned with picking the best vectorization factor and focuses on computing costs for different VFs. I am not sure what the trade-offs are to adding more VF independent cost-modeling there yet (drawbacks are adding even more global state to the cost model, increasing complexity). I assume the motivation for moving it to the cost model is to allow using it when deciding on whether to vectorize given ScalarEpilogueStatus? I think that's best discussed separate with focus on that case.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109368/new/

https://reviews.llvm.org/D109368

Files:
  llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Transforms/LoopVectorize/AArch64/runtime-check-size-based-threshold.ll
  llvm/test/Transforms/LoopVectorize/ARM/mve-qabs.ll
  llvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll
  llvm/test/Transforms/LoopVectorize/X86/pointer-runtime-checks-unprofitable.ll
  llvm/test/Transforms/LoopVectorize/X86/pr23997.ll
  llvm/test/Transforms/LoopVectorize/X86/pr35432.ll
  llvm/test/Transforms/LoopVectorize/X86/runtime-limit.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D109368.375060.patch
Type: text/x-patch
Size: 23138 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210925/e1f3f5bc/attachment.bin>