[PATCH] D149281: Not disable loop unroll for vectorized loops on AMDGPU target

Wed May 3 11:35:09 PDT 2023

alex-t added a comment.

In D149281#4299934 <https://reviews.llvm.org/D149281#4299934>, @fhahn wrote:

> In D149281#4299890 <https://reviews.llvm.org/D149281#4299890>, @rampitec wrote:
>
>> Add a test?
>
> That would be helpful. It would be good to understand why runtime unrolling is needed here. Does the interleave heuristic not kick in?

As far as I understand, before https://reviews.llvm.org/D115261 unrolling was disabled only for epilogue loops.
https://reviews.llvm.org/D115261 disables unrolling for every loop that was vectorized.

The regression is caused by the nature of the GPU target architecture: it has massively parallel hardware and very costly branches.
Loop vectorization on high-level IR yields almost nothing there, whereas loop unrolling is always crucial for performance.
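
To illustrate the branch-cost argument, here is a minimal hand-written sketch of what unrolling by 4 with a remainder loop looks like at the source level. The function names and the unroll factor are hypothetical and are not taken from the patch or from the compiler's actual output; the point is only that the unrolled form executes roughly one loop branch per four elements instead of one per element.

```cpp
#include <cstddef>

// Rolled form: one compare-and-branch per element.
float sum_rolled(const float *a, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

// Hand-unrolled by 4 (illustrative factor). The trip count is not known
// at compile time, so a main loop handles groups of four elements and a
// remainder loop handles the last n % 4 elements.
float sum_unrolled4(const float *a, std::size_t n) {
    float s = 0.0f;
    std::size_t i = 0;
    // Main loop: one compare-and-branch per four elements.
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    // Remainder loop: at most three iterations.
    for (; i < n; ++i)
        s += a[i];
    return s;
}
```

Both functions add the elements in the same order, so they return the same value; only the number of loop branches executed differs.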

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149281/new/

https://reviews.llvm.org/D149281