[Mlir-commits] [mlir] [mlir][nvgpu] update commit group and wait async ops (PR #130482)
lonely eagle
llvmlistbot at llvm.org
Sun Mar 16 03:50:52 PDT 2025
linuxlonelyeagle wrote:
The original implementation, which uses SSA values, can be traced from here. The example from that page, quoted below, shows the original usage. (Note that the wait is for the cp operations, not for the groups that are being committed.)
https://mlir.llvm.org/docs/Dialects/NVGPU/#nvgpudevice_async_copy-nvgpudeviceasynccopyop
```
// copy 1.
%cp1 = nvgpu.device_async_copy %A[%c0], %B[%c0], 4 : memref<16xf32> to memref<16xf32, 3>
// copy 2.
%cp2 = nvgpu.device_async_copy %C[%c0], %D[%c0], 4 : memref<16xf32> to memref<16xf32, 3>
// group 1 contains copy 1 and copy 2.
%token1 = nvgpu.device_async_create_group %cp1, %cp2
// copy 3.
%cp3 = nvgpu.device_async_copy %E[%c0], %F[%c0], 4 : memref<16xf32> to memref<16xf32, 3>
// group 2 contains copy 3.
%token2 = nvgpu.device_async_create_group %cp3
// after the wait copy 1 and copy 2 are complete.
nvgpu.device_async_wait %token1
// after the wait copy 3 is complete.
nvgpu.device_async_wait %token2
// Example:
%0 = nvgpu.device_async_copy %src[%c0, %c0], %dst[%c0, %c0, %c0], 4 :
  memref<4x5xf32> to memref<2x7x5xf32, 3>
```
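To make the token semantics in the comments above concrete, here is a toy Python model. It is only a sketch: `Copy`, `GroupToken`, `create_group`, and `wait` are hypothetical names invented for illustration and are not part of MLIR or the NVGPU dialect. The model captures only the guarantee stated in the comments, namely that waiting on a group token completes the copies bundled into that token.

```python
# Toy model only: every name here is hypothetical, not an MLIR/NVGPU API.
from dataclasses import dataclass


@dataclass
class Copy:
    name: str
    guaranteed_done: bool = False  # becomes True once a wait covers this copy


@dataclass
class GroupToken:
    copies: list  # the copies bundled into this group


def create_group(*copies):
    """Mimics device_async_create_group: bundle copy tokens into one group token."""
    return GroupToken(list(copies))


def wait(token):
    """Mimics device_async_wait: after it returns, the token's copies are complete."""
    for copy in token.copies:
        copy.guaranteed_done = True


cp1, cp2, cp3 = Copy("copy 1"), Copy("copy 2"), Copy("copy 3")
token1 = create_group(cp1, cp2)  # group 1 contains copy 1 and copy 2
token2 = create_group(cp3)       # group 2 contains copy 3

wait(token1)
state_after_first_wait = [c.guaranteed_done for c in (cp1, cp2, cp3)]

wait(token2)
state_after_second_wait = [c.guaranteed_done for c in (cp1, cp2, cp3)]
```

In this model the first wait guarantees only copy 1 and copy 2; copy 3 is guaranteed only after the wait on `token2`.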
https://github.com/llvm/llvm-project/pull/130482