[PATCH] D146523: [AMDGPU]: Add new intrinsic llvm.amdgcn.convergent.copy

Tue Mar 21 09:44:01 PDT 2023

pravinjagtap added a comment.

In D146523#4209889 <https://reviews.llvm.org/D146523#4209889>, @foad wrote:

> In D146523#4209735 <https://reviews.llvm.org/D146523#4209735>, @pravinjagtap wrote:
>
>> In D146523#4209684 <https://reviews.llvm.org/D146523#4209684>, @foad wrote:
>>
>>> What does this do and what is it for?
>>
>> This convergent copy intrinsic acts here as a form of barrier: it makes sure that the VGPR holding the intrinsic's result has been computed for all active lanes before it is used.
>
> Can you give an example?

Consider the following input LLVM IR:

  %sub.i = sub nsw i32 0, %11
  %12 = atomicrmw add ptr addrspace(1) %1, i32 %sub.i syncscope("agent-one-as") monotonic, align 4

Here, I am transforming the computation of `%12` into a `for` loop that will be executed by only the first active lane. One of the basic blocks of this loop needs to read values from (i.e. is a use of) the VGPR corresponding to `%sub.i`.

So with the convergent copy I want to make sure that:

1. All lanes of the VGPR associated with `%sub.i` are computed before the loop starts.
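
As a rough illustration only, the ordering requirement can be modelled in plain Python. This is a sketch, not the actual transformation: the function names, the list-based "VGPR", and the assumption that the loop sums the active lanes' values and then issues one atomic add are all hypothetical choices made for the example.

```python
def reduce_in_first_active_lane(vgpr, exec_mask):
    """Hypothetical model: the first active lane walks every active lane
    and accumulates that lane's value (a cross-lane read of the VGPR)."""
    active = [lane for lane, on in enumerate(exec_mask) if on]
    if not active:
        return None, 0
    first = active[0]
    total = 0
    for lane in active:
        # This read is only correct if vgpr[lane] was already computed
        # for every active lane before the loop started.
        total += vgpr[lane]
    return first, total


def wave_atomic_add(memory, vgpr, exec_mask):
    """Assumed lowering: one atomic add per wave instead of one per lane."""
    first, total = reduce_in_first_active_lane(vgpr, exec_mask)
    old = memory[0]
    if first is not None:
        memory[0] += total
    return old


# Stand-in for `%sub.i = sub nsw i32 0, %11`, computed for all lanes
# up front; this is the property the convergent copy is meant to enforce.
v11 = [1, 2, 3, 4]
sub_i = [0 - x for x in v11]
exec_mask = [False, True, True, False]
```

In this model `sub_i` is fully materialised before the loop runs, which is the guarantee asked of the intrinsic; if some lanes' entries were filled in later, the first active lane would read stale values.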

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146523/new/

https://reviews.llvm.org/D146523