[Mlir-commits] [mlir] [mlir] `im2col` & `l2cache` on `cp.async.bulk.tensor.shared.cluster.global` (PR #72967)

Guray Ozen llvmlistbot at llvm.org
Tue Nov 21 02:29:35 PST 2023


================
@@ -1404,20 +1404,34 @@ def NVVM_CpAsyncBulkTensorGlobalToSharedClusterOp :
   AttrSizedOperandSegments]>,
   Arguments<(ins  LLVM_PointerShared:$dstMem,
                   LLVM_AnyPointer:$tmaDescriptor,
-                  LLVM_PointerShared:$mbar,
-                  Optional<I16>:$multicastMask,                  
                   Variadic<I32>:$coordinates,
+                  LLVM_PointerShared:$mbar,                  
+                  Variadic<I16>:$im2colOffsets,
----------------
grypp wrote:

Yes, you are right: `im2colOffsets` is optional. It is declared as `Variadic<I16>`, which can be empty (zero operands).

For example, the following op is valid and executes in `tiled` mode:
```
nvvm.cp.async.bulk.tensor.shared.cluster.global %0, %1, %2, 
                box[%d0,%d1,%d2] : !llvm.ptr<3>, !llvm.ptr
```
`im2col` mode is activated when offsets are present, as in the following:
```
nvvm.cp.async.bulk.tensor.shared.cluster.global %0, %1, %2, 
                box[%d0,%d1,%d2] im2col[%off0]: !llvm.ptr<3>, !llvm.ptr
```
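
The rule above (no offsets means `tiled`, any offsets mean `im2col`) can be sketched in C++ roughly as follows. This is only an illustration of the selection logic; `LoadMode`, `selectLoadMode`, and `modeName` are hypothetical names and are not taken from the patch or from the MLIR sources.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration only; not part of the NVVM dialect.
enum class LoadMode { Tiled, Im2Col };

// The im2col offsets are a variadic operand that may be empty, so the
// operand count alone is enough to pick the mode.
LoadMode selectLoadMode(std::size_t numIm2colOffsets) {
  return numIm2colOffsets == 0 ? LoadMode::Tiled : LoadMode::Im2Col;
}

// Human-readable mode name, matching the wording used in this thread.
std::string modeName(LoadMode mode) {
  return mode == LoadMode::Tiled ? "tiled" : "im2col";
}
```

With this sketch, the first op above (no `im2col[...]` clause) maps to `selectLoadMode(0)`, and the second op (one offset) maps to `selectLoadMode(1)`.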

https://github.com/llvm/llvm-project/pull/72967

