[all-commits] [llvm/llvm-project] 38d9a4: [MLIR][NVGPU] Add `tma.fence.descriptor` OP (#133218)

Thu Mar 27 07:20:41 PDT 2025

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 38d9a445106cba09854d9d00050a20f6faa4dd0b
      https://github.com/llvm/llvm-project/commit/38d9a445106cba09854d9d00050a20f6faa4dd0b
  Author: Guray Ozen <guray.ozen at gmail.com>
  Date:   2025-03-27 (Thu, 27 Mar 2025)

  Changed paths:
    M mlir/include/mlir/Dialect/NVGPU/IR/NVGPUOps.td
    M mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
    M mlir/test/Conversion/NVGPUToNVVM/nvgpu-to-nvvm.mlir

  Log Message:
  -----------
  [MLIR][NVGPU] Add `tma.fence.descriptor` OP (#133218)

When the TMA descriptor is transferred from host memory to global memory
using cudaMemcpy, each thread block must insert a fence before any
thread accesses the updated tensor map in global memory. Once the tensor
map has been accessed, no additional fences are needed by that block
unless the map is modified again.

[Example from cuda programming
guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#using-tma-to-transfer-multi-dimensional-arrays).
The `tma.fence.descriptor` basically implements
`ptx::fence_proxy_tensormap_generic`.
```
#include <cuda.h>
#include <cuda/ptx>
namespace ptx = cuda::ptx;

__device__ CUtensorMap global_tensor_map;
__global__ void kernel(CUtensorMap *tensor_map)
{
  // Fence acquire tensor map:
  ptx::n32_t<128> size_bytes;
  // Since the tensor map was modified from the host using cudaMemcpy,
  // the scope should be .sys.
  ptx::fence_proxy_tensormap_generic(
     ptx::sem_acquire, ptx::scope_sys, tensor_map, size_bytes
 );
 // Safe to use tensor_map after fence inside this thread..
}
int main() {
  CUtensorMap local_tensor_map;
  // [ ..Initialize map.. ]
  cudaMemcpy(&global_tensor_map, &local_tensor_map, sizeof(CUtensorMap), cudaMemcpyHostToDevice);
  kernel<<<1, 1>>>(global_tensor_map);
}
```

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications