[PATCH] D118185: [AMDGPU] Mark v_cmp* convergent

Ruiling, Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 26 04:25:28 PST 2022


ruiling added a comment.

In D118185#3271905 <https://reviews.llvm.org/D118185#3271905>, @foad wrote:

>> I am not sure this is a good idea. Actually an ordinary V_CMP is still sinkable. It is only the V_CMP lowered from llvm.amdgcn.icmp that cannot be sinked.
>
> Right, an ordinary V_CMP is still sinkable because of the way the result is used, there is an implicit guarantee that bits in the result corresponding to inactive lanes (at the point of use) will be ignored. So marking *all* V_CMPs convergent is overkill, although it does not seem to have any bad effects on the test suite. On the other hand, duplicating all the V_CMP instructions (so we have separate convergent and non-convergent versions of them) doesn't seem like a good idea, because there are lots of them.

If duplicating V_CMP for convergent and non-convergent version would make code over-complex, I would go for this simple solution. We can re-visit this issue when we really hit some performance issue here. Let's see if others have different opinion.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118185/new/

https://reviews.llvm.org/D118185



More information about the llvm-commits mailing list