[all-commits] [llvm/llvm-project] 9ceea0: [mlir] `im2col` & `l2cache` on cp.async.bulk.tenso...

Wed Nov 22 07:08:23 PST 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9ceea088592ebaafed8eaccfc236a0ec284655ce
      https://github.com/llvm/llvm-project/commit/9ceea088592ebaafed8eaccfc236a0ec284655ce
  Author: Guray Ozen <guray.ozen at gmail.com>
  Date:   2023-11-22 (Wed, 22 Nov 2023)

  Changed paths:
    M mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
    M mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
    M mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
    M mlir/test/Conversion/NVGPUToNVVM/nvgpu-to-nvvm.mlir
    M mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
    M mlir/test/Dialect/LLVMIR/invalid.mlir

  Log Message:
  -----------
  [mlir] `im2col` & `l2cache` on `cp.async.bulk.tensor.shared.cluster.global` (#72967)

This PR adds support for `im2col` and `l2cache` to
`cp.async.bulk.tensor.shared.cluster.global`. The Op now supports all
the traits of the corresponding PTX instruction.

The operation now has roughly the structure shown below. The PR also
simplifies the type list, so the obvious types no longer need to be
written after `:`.
```
nvvm.cp.async.bulk.tensor.shared.cluster.global
		%dest, %tmaDescriptor, %barrier,
		box[%crd0,%crd1,%crd2,%crd3,%crd4]
		im2col[%off0,%off1,%off2] 			<-- PR introduces
		multicast_mask = %ctamask
		l2_cache_hint = %cacheHint			<-- PR introduces
		: !llvm.ptr<3>, !llvm.ptr
```
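
The operand counts in the example above (five `box` coordinates, three
`im2col` offsets) can be sanity-checked with a small sketch. This is not
the verifier added by the PR. It is a hypothetical Python helper
(`check_tma_operands` and `MAX_TENSOR_RANK` are names made up here) that
assumes the rule, as I understand it from the PTX documentation, that
im2col mode applies to tensors of rank 3 to 5 and takes rank minus 2
offsets.

```python
# Hypothetical helper, not part of MLIR or LLVM. It only models the
# operand-count rule assumed in the paragraph above.
MAX_TENSOR_RANK = 5  # assumption: tensor copies address 1-D to 5-D tensors


def check_tma_operands(num_coords, num_im2col_offsets=0):
    """Return a list of problems; an empty list means the counts agree."""
    problems = []
    if not 1 <= num_coords <= MAX_TENSOR_RANK:
        problems.append(
            f"expected 1 to {MAX_TENSOR_RANK} coordinates, got {num_coords}"
        )
    if num_im2col_offsets:
        # Assumed rule: im2col needs rank >= 3 and exactly rank - 2 offsets.
        if num_coords < 3:
            problems.append("im2col mode needs at least 3 coordinates")
        elif num_im2col_offsets != num_coords - 2:
            problems.append(
                f"expected {num_coords - 2} im2col offsets "
                f"for {num_coords} coordinates, got {num_im2col_offsets}"
            )
    return problems


if __name__ == "__main__":
    # Counts taken from the example in the log message: 5 coords, 3 offsets.
    print(check_tma_operands(5, 3))
    # A mismatched pair, to show what a reported problem looks like.
    print(check_tma_operands(4, 3))
```

Under these assumptions the counts from the log message pass, while four
coordinates with three offsets are reported as a mismatch.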