[PATCH] D146523: [AMDGPU]: Add new intrinsic llvm.amdgcn.convergent.copy
Pravin Jagtap via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 21 09:44:01 PDT 2023
pravinjagtap added a comment.
In D146523#4209889 <https://reviews.llvm.org/D146523#4209889>, @foad wrote:
> In D146523#4209735 <https://reviews.llvm.org/D146523#4209735>, @pravinjagtap wrote:
>
>> In D146523#4209684 <https://reviews.llvm.org/D146523#4209684>, @foad wrote:
>>
>>> What does this do and what is it for?
>>
>> This convergent copy intrinsic acts as a form of barrier which makes sure that all the active lanes of the VGPR (i.e. the result of the intrinsic) are computed before its use.
>
> Can you give an example?
Consider the following input LLVM IR:
%sub.i = sub nsw i32 0, %11
%12 = atomicrmw add ptr addrspace(1) %1, i32 %sub.i syncscope("agent-one-as") monotonic, align 4
Here, I am transforming the computation of `%12` into a `for` loop which will be executed by only the first active lane. One of the basic blocks associated with this `for` loop needs to read values from the VGPR corresponding to `%sub.i` (a use).
So with the convergent copy I want to make sure that:
1. All lanes of the VGPR associated with `%sub.i` are computed before starting the loop.
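
As a rough illustration (not code from this patch), the loop described above can be modeled in Python. The function name, the list-of-lanes representation of a VGPR, and the integer exec mask are all assumptions made for this sketch, and i32 wraparound is ignored. The point it tries to show is that the serial loop reads the operand from every active lane, so every lane's value has to exist before the loop begins.

```python
def first_active_lane_loop(lane_values, exec_mask, memory_value):
    """Model one wavefront doing `atomicrmw add` through a serial loop.

    lane_values  -- per-lane contents of the VGPR holding the operand
                    (hypothetical stand-in for %sub.i)
    exec_mask    -- integer bitmask; bit i set means lane i is active
    memory_value -- contents of the memory location before the atomic

    Returns (new_memory_value, per_lane_results). Inactive lanes get None.
    """
    pending = exec_mask
    running_sum = 0
    prefix = [None] * len(lane_values)

    # Serial loop: visit active lanes from lowest to highest index.
    while pending:
        lane = (pending & -pending).bit_length() - 1  # index of lowest set bit
        prefix[lane] = running_sum          # exclusive prefix sum for this lane
        running_sum += lane_values[lane]    # cross-lane read of the operand
        pending &= pending - 1              # clear the bit just handled

    # A single atomic add of the combined value replaces one atomic per lane.
    old = memory_value
    new = old + running_sum

    # Each active lane reconstructs the value it would have seen on its own.
    results = [None if p is None else old + p for p in prefix]
    return new, results


new_value, per_lane = first_active_lane_loop([5, 7, 11, 13], 0b1011, 100)
print(new_value, per_lane)
```

In this model `lane_values` is fully populated before the loop is entered, which is the property the convergent copy is meant to guarantee for the real VGPR.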
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D146523/new/
https://reviews.llvm.org/D146523