[all-commits] [llvm/llvm-project] 9ceea0: [mlir] `im2col` & `l2cache` on cp.async.bulk.tenso...

Guray Ozen via All-commits all-commits at lists.llvm.org
Wed Nov 22 07:08:23 PST 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9ceea088592ebaafed8eaccfc236a0ec284655ce
      https://github.com/llvm/llvm-project/commit/9ceea088592ebaafed8eaccfc236a0ec284655ce
  Author: Guray Ozen <guray.ozen at gmail.com>
  Date:   2023-11-22 (Wed, 22 Nov 2023)

  Changed paths:
    M mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
    M mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
    M mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
    M mlir/test/Conversion/NVGPUToNVVM/nvgpu-to-nvvm.mlir
    M mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
    M mlir/test/Dialect/LLVMIR/invalid.mlir

  Log Message:
  -----------
  [mlir] `im2col` & `l2cache` on `cp.async.bulk.tensor.shared.cluster.global` (#72967)

This PR adds support for `im2col` and `l2cache` to
`cp.async.bulk.tensor.shared.cluster.global`. The Op now supports all
the traits of the corresponding PTX instruction.

The current structure of this operation is roughly as shown below. The
PR also simplifies the type list, so the obvious types no longer need to
be written after `:`.
```
nvvm.cp.async.bulk.tensor.shared.cluster.global
    %dest, %tmaDescriptor, %barrier,
    box[%crd0,%crd1,%crd2,%crd3,%crd4]
    im2col[%off0,%off1,%off2]          <-- PR introduces
    multicast_mask = %ctamask
    l2_cache_hint = %cacheHint         <-- PR introduces
    : !llvm.ptr<3>, !llvm.ptr
```
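
The example above pairs five box coordinates with three `im2col` offsets. A minimal sketch of that relationship follows, assuming the PTX ISA rule that im2col mode applies only to tensors of rank 3 to 5 and takes two fewer offsets than coordinates. The helper name `check_tma_load_operands` and its messages are hypothetical and are not taken from the verifier in this commit.

```python
def check_tma_load_operands(num_coords, num_im2col_offsets=0):
    """Return None if the operand counts are consistent, else an error string.

    Hypothetical helper for illustration; it is not part of MLIR.
    """
    # PTX tensor copies address 1-D through 5-D tensors.
    if not 1 <= num_coords <= 5:
        return "expected between 1 and 5 coordinates"
    if num_im2col_offsets:
        # Assumption from the PTX ISA: im2col mode needs rank 3 or higher.
        if num_coords < 3:
            return "im2col mode needs at least 3 coordinates"
        # Assumption from the PTX ISA: offsets count is rank minus 2.
        if num_im2col_offsets != num_coords - 2:
            return "im2col needs exactly (coordinates - 2) offsets"
    return None


# The 5-D example from the log message: 5 coordinates, 3 offsets.
print(check_tma_load_operands(5, 3))
# A 2-D load cannot use im2col mode.
print(check_tma_load_operands(2, 1))
```

Leaving `num_im2col_offsets` at its default of 0 models the plain tiled load, in which the `im2col[...]` clause is omitted entirely.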



