[all-commits] [llvm/llvm-project] b4b819: [MLIR][NVVM] Add Op for TMA Store with reduction (...

Durgadoss R via All-commits all-commits at lists.llvm.org
Wed Dec 11 06:33:12 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: b4b819ce98f1d77d29ec492f0230018fd633a117
      https://github.com/llvm/llvm-project/commit/b4b819ce98f1d77d29ec492f0230018fd633a117
  Author: Durgadoss R <durgadossr at nvidia.com>
  Date:   2024-12-11 (Wed, 11 Dec 2024)

  Changed paths:
    M mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
    M mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
    A mlir/test/Target/LLVMIR/nvvm/tma_store_reduce.mlir
    M mlir/test/Target/LLVMIR/nvvmir-invalid.mlir

  Log Message:
  -----------
  [MLIR][NVVM] Add Op for TMA Store with reduction (#118853)

PR #116854 adds intrinsics for TMA Store with reduction.
This patch adds an NVVM Dialect Op for the same.

* Lit tests are added to verify the lowering to LLVM intrinsics and
  the invalid cases.
* The common verifier method is updated to handle im2col modes without
  offsets. This helps Ops like TMA Store, TMA StoreReduce, etc.
* The nvvmir.mlir test file is already large, so this patch adds the
  tests for this Op in a new file under a separate "nvvm/" directory:
  [mlir/test/Target/LLVMIR/"nvvm"/tma_store_reduce.mlir]
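
The verifier change in the second bullet can be illustrated with a
stand-alone sketch. This is not the code from NVVMDialect.cpp: the
function name, signature, and error strings are hypothetical, and the
constraints shown (1 to 5 tensor dimensions, at least 3 dimensions in
im2col mode, and exactly dims - 2 offsets when offsets are present) are
assumptions based on the PTX description of bulk tensor copies. The
point is only that an offset count of zero is accepted in im2col mode,
which is what store-style Ops need.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-alone check (not the MLIR verifier itself).
// Returns an error message, or std::nullopt when the combination is valid.
std::optional<std::string> verifyTensorCoords(std::size_t tensorDims,
                                              bool isIm2Col,
                                              std::size_t numIm2ColOffsets) {
  // Assumption: bulk tensor copies address 1-D to 5-D tensors.
  if (tensorDims < 1 || tensorDims > 5)
    return "expects 1 to 5 tensor coordinates";

  if (isIm2Col) {
    // Assumption: im2col mode is only defined for tensors of rank >= 3.
    if (tensorDims < 3)
      return "im2col mode expects at least 3 tensor dimensions";

    // Offsets are optional. Store-style Ops pass none, so zero is accepted;
    // when offsets are given, there must be exactly (dims - 2) of them.
    if (numIm2ColOffsets != 0 && tensorDims != numIm2ColOffsets + 2)
      return "im2col offsets must be (dims - 2) in number";
  }
  return std::nullopt;
}
```

For example, under these assumptions verifyTensorCoords(3, true, 0)
succeeds (im2col without offsets), while verifyTensorCoords(2, true, 0)
reports an error because the tensor has fewer than 3 dimensions.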

PTX Spec reference:

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-reduce-async-bulk-tensor

Signed-off-by: Durgadoss R <durgadossr at nvidia.com>


