[PATCH] D102834: [SLPVectorizer] Implement initial memory versioning.

Wed Aug 18 12:50:27 PDT 2021

fhahn updated this revision to Diff 367309.
fhahn added a comment.

Rebased and improved runtime check generation: 1) do not generate checks between two read-only groups, and 2) skip overflow checks, because we know that the last element is dereferenced (Alive2 agrees, but I will double-check whether that is intentional).
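
For readers unfamiliar with memory versioning, the two points above can be sketched in plain C. This is only an illustration of the idea, not the IR the patch emits: `ranges_disjoint` and `add4_versioned` are hypothetical names, and the 4-byte group size is an assumption chosen for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when [a, a + len_a) and [b, b + len_b)
 * share no byte. The end addresses are computed without a wrap-around
 * check, on the assumption that every byte of each range is really
 * accessed, so each range lies inside a valid object. */
static int ranges_disjoint(const uint8_t *a, size_t len_a,
                           const uint8_t *b, size_t len_b) {
    uintptr_t a_lo = (uintptr_t)a, a_hi = a_lo + len_a;
    uintptr_t b_lo = (uintptr_t)b, b_hi = b_lo + len_b;
    return a_hi <= b_lo || b_hi <= a_lo;
}

/* Hypothetical versioned kernel: dst[i] = src1[i] + src2[i] for 4 bytes. */
void add4_versioned(uint8_t *dst, const uint8_t *src1, const uint8_t *src2) {
    /* src1 and src2 are only read, so no check is emitted between them;
     * only the written group is compared against each read group. */
    if (ranges_disjoint(dst, 4, src1, 4) && ranges_disjoint(dst, 4, src2, 4)) {
        /* Versioned path: all loads happen before any store, which is
         * the reordering a vectorized body would perform. */
        uint8_t t[4];
        for (int i = 0; i < 4; ++i)
            t[i] = (uint8_t)(src1[i] + src2[i]);
        for (int i = 0; i < 4; ++i)
            dst[i] = t[i];
    } else {
        /* Fallback: original scalar order, correct even with overlap. */
        for (int i = 0; i < 4; ++i)
            dst[i] = (uint8_t)(src1[i] + src2[i]);
    }
}
```

When `dst` overlaps one of the sources, the check fails and the scalar order is preserved, so the result matches the unversioned code.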

In D102834#2941662 <https://reviews.llvm.org/D102834#2941662>, @SjoerdMeijer wrote:

> In D102834#2940056 <https://reviews.llvm.org/D102834#2940056>, @fhahn wrote:
>
>> In D102834#2919138 <https://reviews.llvm.org/D102834#2919138>, @SjoerdMeijer wrote:
>>
>>> Sorry for the late reply, but just wanted to confirm this: yes, that `@f_alias` in `../AArch64/loadi8.ll` is a reproducer from x264.
>>
>> Great thanks. The code in the test should be transformed now. If you point me to the C code, I can check if it is transformed as expected now.
>
> I believe that was function `mc_weight_w20()`.

Hm, interesting. The latest version should vectorize `mc_weight_w20` on AArch64. Unfortunately, there is no measurable speedup from that on the hardware I have access to. I also cannot measure any speedup if I add `restrict` to the `mc_weight_w20` arguments, which should allow SLP vectorization without runtime checks. Is it possible I am missing something?
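
As a reminder of what the `restrict` experiment looks like, here is a minimal C sketch. `scale4_restrict` is a hypothetical stand-in and does not reproduce the signature or body of x264's `mc_weight_w20`; it only shows how the qualifier promises the compiler that the two pointers do not alias, so no runtime overlap check is needed.

```c
#include <stdint.h>

/* Hypothetical example: with both pointers marked restrict, the compiler
 * may assume the 4 stores to dst never modify the bytes read through src,
 * so the loads can be grouped ahead of the stores without a runtime check. */
void scale4_restrict(uint8_t *restrict dst, const uint8_t *restrict src,
                     int scale) {
    for (int i = 0; i < 4; ++i)
        dst[i] = (uint8_t)(src[i] * scale);
}
```

Calling such a function with overlapping `dst` and `src` is undefined behavior, which is the trade-off compared with the runtime-checked version.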

> And like I mentioned earlier, I expect this change to make quite some differences for x264. For example, I hope it will trigger on `quant_4x4()` too, although additional trick may be required for successful SLP vectorisation of that example (cost-model and or other things).

I am seeing 5-10% speedups when vectorizing `quant_4x4`. After runtime check generation, there is still another issue, though: a conditional load that is not hoisted out.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102834/new/

https://reviews.llvm.org/D102834

Files:
  llvm/include/llvm/Transforms/Vectorize/SLPVectorizer.h
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/test/Transforms/SLPVectorizer/AArch64/loadi8.ll
  llvm/test/Transforms/SLPVectorizer/AArch64/memory-runtime-checks-in-loops.ll
  llvm/test/Transforms/SLPVectorizer/AArch64/memory-runtime-checks.ll
  llvm/test/Transforms/SLPVectorizer/X86/memory-runtime-checks.ll
