[all-commits] [llvm/llvm-project] 18e161: [MLIR][NVVM] Introduction of the `wgmma.mma_async` Op
Guray Ozen via All-commits
all-commits at lists.llvm.org
Wed Aug 9 14:08:16 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 18e161f9e15b036faf48bfd8813d9330e06e2ee3
https://github.com/llvm/llvm-project/commit/18e161f9e15b036faf48bfd8813d9330e06e2ee3
Author: Guray Ozen <guray.ozen at gmail.com>
Date: 2023-08-09 (Wed, 09 Aug 2023)
Changed paths:
M mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
M mlir/lib/Conversion/NVVMToLLVM/NVVMToLLVM.cpp
M mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
A mlir/test/Conversion/NVVMToLLVM/invalid.mlir
M mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
Log Message:
-----------
[MLIR][NVVM] Introduction of the `wgmma.mma_async` Op
This work introduces the `wgmma.mma_async` Op along with PTX generation using `BasicPtxBuilderOpInterface`. The Op executes the matrix multiply-and-accumulate operation across a warpgroup (128 threads). Note that this operation requires a device with the sm_90a capability.
The matrix multiply-and-accumulate operation can take one of the following forms. In both cases, matrix D is referred to as the accumulator:
D = A * B + D : Result is added to the accumulator matrix D.
D = A * B : The input from the accumulator matrix D is not utilized.
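The two forms above can be illustrated with a scalar, host-side reference. This is only a sketch of the arithmetic, not the Op's lowering or the PTX it generates; the `Matrix` type, the `mmaReference` function, and the `useAccumulatorInput` flag are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a row-major matrix stored in a flat vector.
struct Matrix {
  std::size_t rows, cols;
  std::vector<float> data;
  Matrix(std::size_t r, std::size_t c, float init = 0.0f)
      : rows(r), cols(c), data(r * c, init) {}
  float &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Scalar reference for the two forms in the log message (assumed semantics):
//   useAccumulatorInput == true  : D = A * B + D
//   useAccumulatorInput == false : D = A * B  (previous contents of D ignored)
void mmaReference(const Matrix &a, const Matrix &b, Matrix &d,
                  bool useAccumulatorInput) {
  // Shapes must agree: (M x K) * (K x N) -> (M x N).
  assert(a.cols == b.rows && d.rows == a.rows && d.cols == b.cols);
  for (std::size_t i = 0; i < d.rows; ++i) {
    for (std::size_t j = 0; j < d.cols; ++j) {
      // Start from the old accumulator value, or from zero if it is unused.
      float sum = useAccumulatorInput ? d.at(i, j) : 0.0f;
      for (std::size_t k = 0; k < a.cols; ++k)
        sum += a.at(i, k) * b.at(k, j);
      d.at(i, j) = sum;
    }
  }
}
```

In both forms D is overwritten with the result; the flag only decides whether the previous contents of D contribute to it.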
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D157370