[PATCH] D115261: [LV] Disable runtime unrolling for vectorized loops.

Mon Jan 2 10:44:18 PST 2023

lebedev.ri added a comment.

In D115261#4021955 <https://reviews.llvm.org/D115261#4021955>, @fhahn wrote:

> Rebased and adjusted code back to be for runtime-unrolling only for now.
>
> In D115261#3807958 <https://reviews.llvm.org/D115261#3807958>, @xbolva00 wrote:
>
>> Are you gonna land this patch? Or any blockers?
>
> Not any longer with 9758242046b3 <https://reviews.llvm.org/rG9758242046b3cdce6fb713acb6d3f5bfaa933a47> landed.

(not any longer //what//? :)

> In D115261#4001190 <https://reviews.llvm.org/D115261#4001190>, @lebedev.ri wrote:
>
>> Does this disable all unrolling, or only runtime unrolling?
>
> This went through a couple of iterations. Updated the code to limit to runtime unrolling only as the description/title says.
>
>> I strongly suspect that full unrolling should still be allowed.
>> Consider e.g. https://godbolt.org/z/fsdMhETh3 from D136806 <https://reviews.llvm.org/D136806>,
>> there we still need to fully unroll, after vectorization, to get SROA to trigger.
>
> Full unrolling is back to being allowed for this patch.

Okay.

>> I'm also a bit wary of the fact that LV unrolling
>> is functionally different from normal unrolling.
>
> It is different, but LV's cost-modeling should be more realistic than the one for runtime unrolling. Runtime unrolling can be actively harmful (https://github.com/llvm/llvm-project/issues/40306) after LV and it adds substantial compile-time for little/no gain in the benchmark runs I did. If there are cases where it actively helps, I am happy to analyze those.

I'm talking about the fact that LV unrolling increases vector sizes
(I mean, unless I'm grossly misremembering things?), while normal
runtime unrolling just executes N vector iterations at once,
allowing for speculative execution. So, ignoring the cost-model question,
they *are* different.
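
To make the "normal runtime unrolling" half of that concrete, here is a minimal hand-written sketch in scalar C++. It is only an analogy: the function names and the unroll factor of 4 are made up for illustration, and this is not what LoopUnroll actually emits. The point is that the body is replicated at the same operation width, and a remainder loop handles the leftover iterations because the trip count is known only at run time.

```cpp
#include <cstddef>
#include <cstdint>

// Baseline: one element per iteration; n is known only at run time.
std::uint64_t sum_rolled(const std::uint32_t* a, std::size_t n) {
    std::uint64_t s = 0;
    for (std::size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

// Hand-written analogue of runtime unrolling by 4 (the factor is an
// arbitrary choice for this sketch). The body is replicated four times,
// the width of each operation is unchanged, and a remainder loop
// handles the last n % 4 elements.
std::uint64_t sum_unrolled4(const std::uint32_t* a, std::size_t n) {
    std::uint64_t s = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; ++i)  // remainder loop for the leftover iterations
        s += a[i];
    return s;
}
```

Both functions return the same result for any n; only the loop structure differs.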

Look e.g. at the actual codegen for interleaving:
with higher VFs we often start running out of registers and spill,
and by that point any vectorization gains are almost lost,
while if we'd stayed at some smaller VF, we'd be fine.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115261/new/

https://reviews.llvm.org/D115261