[Mlir-commits] [mlir] [mlir] `im2col` & `l2cache` on `cp.async.bulk.tensor.shared.cluster.global` (PR #72967)
Guray Ozen
llvmlistbot at llvm.org
Tue Nov 21 02:29:35 PST 2023
================
@@ -1404,20 +1404,34 @@ def NVVM_CpAsyncBulkTensorGlobalToSharedClusterOp :
AttrSizedOperandSegments]>,
Arguments<(ins LLVM_PointerShared:$dstMem,
LLVM_AnyPointer:$tmaDescriptor,
- LLVM_PointerShared:$mbar,
- Optional<I16>:$multicastMask,
Variadic<I32>:$coordinates,
+ LLVM_PointerShared:$mbar,
+ Variadic<I16>:$im2colOffsets,
----------------
grypp wrote:
Yes, you are right: `im2colOffsets` is optional. It is declared as `Variadic<I16>`, which can be zero-sized.
For example, the following op is valid and executes in `tiled` mode:
```
nvvm.cp.async.bulk.tensor.shared.cluster.global %0, %1, %2,
box[%d0,%d1,%d2] : !llvm.ptr<3>, !llvm.ptr
```
`im2col` mode is activated when offsets are present, as in the following op:
```
nvvm.cp.async.bulk.tensor.shared.cluster.global %0, %1, %2,
box[%d0,%d1,%d2] im2col[%off0] : !llvm.ptr<3>, !llvm.ptr
```
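As a rough illustration of the rule above, the mode choice can be sketched as a function of the operand counts. This is not the code from the PR: `LoadMode` and `select_load_mode` are hypothetical names, and the check that `im2col` takes two fewer offsets than coordinates is an assumption drawn from the PTX rules for `cp.async.bulk.tensor`, not something stated in this thread.

```python
from enum import Enum


class LoadMode(Enum):
    """Hypothetical enum naming the two load modes discussed above."""
    TILED = "tiled"
    IM2COL = "im2col"


def select_load_mode(num_coordinates: int, num_im2col_offsets: int) -> LoadMode:
    """Pick the load mode from operand counts (hypothetical helper)."""
    # An empty variadic im2col segment means the op runs in tiled mode.
    if num_im2col_offsets == 0:
        return LoadMode.TILED
    # Assumption (from PTX, not from this thread): im2col mode takes
    # exactly two fewer offsets than there are coordinates.
    if num_coordinates != num_im2col_offsets + 2:
        raise ValueError(
            "im2col offsets must be 2 less than number of coordinates"
        )
    return LoadMode.IM2COL


# Mirrors the two ops above: box[%d0,%d1,%d2] with and without im2col[%off0].
print(select_load_mode(3, 0).value)
print(select_load_mode(3, 1).value)
```

Deriving the mode from the operand count keeps the op free of a separate mode attribute, which appears to be the design the `Variadic<I16>` operand is going for.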
https://github.com/llvm/llvm-project/pull/72967