[all-commits] [llvm/llvm-project] b4b819: [MLIR][NVVM] Add Op for TMA Store with reduction (...
Durgadoss R via All-commits
all-commits at lists.llvm.org
Wed Dec 11 06:33:12 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: b4b819ce98f1d77d29ec492f0230018fd633a117
https://github.com/llvm/llvm-project/commit/b4b819ce98f1d77d29ec492f0230018fd633a117
Author: Durgadoss R <durgadossr at nvidia.com>
Date: 2024-12-11 (Wed, 11 Dec 2024)
Changed paths:
M mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
M mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
A mlir/test/Target/LLVMIR/nvvm/tma_store_reduce.mlir
M mlir/test/Target/LLVMIR/nvvmir-invalid.mlir
Log Message:
-----------
[MLIR][NVVM] Add Op for TMA Store with reduction (#118853)
PR #116854 adds intrinsics for TMA Store with reduction.
This patch adds an NVVM Dialect Op for the same.
* Lit tests are added to verify the lowering to LLVM intrinsics and
  the invalid cases.
* The common verifier method is updated to handle im2col modes without
  offsets. This helps Ops like TMA Store, TMA StoreReduce, etc.
* The nvvmir.mlir test file is already large, so this patch adds the
  tests for this Op in a new file under a separate "nvvm/" directory:
  mlir/test/Target/LLVMIR/nvvm/tma_store_reduce.mlir
PTX Spec reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-reduce-async-bulk-tensor
Signed-off-by: Durgadoss R <durgadossr at nvidia.com>
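The verifier change mentioned in the log message (accepting im2col modes that carry no offsets) can be illustrated with a small standalone sketch. This is not the code from NVVMDialect.cpp: the function name `verifyTmaTensorShape`, its return type, and the error strings are hypothetical, and the rules encoded here (1 to 5 tensor coordinates, im2col needing at least a 3-d tensor, and offsets numbering dims - 2 when present) are assumptions drawn from the PTX description of bulk tensor copies.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-in for a shared TMA tensor verifier.
// Returns an error message on failure and std::nullopt on success.
std::optional<std::string> verifyTmaTensorShape(std::size_t tensorDims,
                                                bool isIm2Col,
                                                std::size_t numIm2ColOffsets) {
  // Assumed rule: bulk tensor copies address 1-d to 5-d tensors.
  if (tensorDims < 1 || tensorDims > 5)
    return "expects between 1 and 5 tensor coordinates";

  if (isIm2Col) {
    // Assumed rule: im2col mode is only meaningful for 3-d tensors and up.
    if (tensorDims < 3)
      return "im2col mode needs at least a 3-d tensor";

    // Offsets are optional. Store-like Ops pass none, so a count of zero is
    // accepted; when offsets are present they must number (dims - 2).
    if (numIm2ColOffsets != 0 && tensorDims != numIm2ColOffsets + 2)
      return "im2col offsets must be (dims - 2) in number";
  }
  return std::nullopt;
}
```

With this shape, a store-style Op in im2col mode would call the check with an offset count of zero, while a load-style Op would pass its actual offset count, so both can share one verifier.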