[PATCH] D149281: Not disable loop unroll for vectorized loops on AMDGPU target

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 8 18:14:18 PDT 2023


alex-t added a comment.

In D149281#4322079 <https://reviews.llvm.org/D149281#4322079>, @nikic wrote:

> The question is why the vectorizer failed to unroll the loop in your workload.

In fact, it did not fail. The "unrolling via interleaving" was deliberately disabled for the AMDGPU target because it led to an uncontrolled increase in register pressure (RP).
The corresponding change was made in https://reviews.llvm.org/D122850.

For a CPU, the interleave count is chosen by subtracting the number of loop invariants from the number of available registers and dividing the result by the register pressure for the given register class. This gives an estimate of the number of computation flows that can run simultaneously.
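
As a rough illustration of the CPU-side arithmetic described above, here is a minimal sketch. The function name, parameter names, and the clamp to a minimum of 1 are assumptions made for the example; this is not the vectorizer's actual code, which applies further limits.

```python
def estimate_interleave_count(available_regs: int,
                              loop_invariant_regs: int,
                              regs_per_flow: int) -> int:
    """Hypothetical helper: estimate how many independent computation
    flows fit in the register file for one register class.

    available_regs      -- registers the target offers in this class
    loop_invariant_regs -- registers held by loop-invariant values
    regs_per_flow       -- register pressure of a single copy of the loop body
    """
    if regs_per_flow <= 0:
        raise ValueError("regs_per_flow must be positive")
    # Registers left over once loop invariants are pinned.
    free_regs = available_regs - loop_invariant_regs
    # Each interleaved copy needs regs_per_flow registers; never go below 1
    # (assumption: a count of 1 means "do not interleave").
    return max(1, free_regs // regs_per_flow)


# Illustrative numbers only: 32 registers, 4 invariants, 7 registers per flow.
print(estimate_interleave_count(32, 4, 7))
```

With these illustrative numbers, 28 registers remain after the invariants and each flow needs 7, so the estimate is 4 interleaved copies.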

For a GPU, which is natively a SIMT machine, this high-level estimate simply does not make sense.
The LoopUnroll pass is controllable and lets us make a reasonable trade-off between unroll size and RP.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149281/new/

https://reviews.llvm.org/D149281


