[Mlir-commits] [mlir] [nvgpu][mlir] Fix wgmma store offset (PR #154581)

llvmlistbot at llvm.org
Wed Aug 20 10:35:16 PDT 2025


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-mlir-gpu

Author: lonely eagle (linuxlonelyeagle)

<details>
<summary>Changes</summary>

When M is greater than 64 in wgmma, the accumulator type consists of multiple inner struct types. Advancing the memref row offset by the number of elements in each inner struct is incorrect; the row offset should instead advance by `kWgmmaSizeM` for each inner struct. For fp16, this offset is 64.
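
The offset arithmetic described above can be sketched outside of MLIR. This is a minimal illustration, not the lowering itself: `rowOffsets` is a hypothetical helper, `kWgmmaSizeM` is assumed to be 64 to mirror the dialect constant of the same name, and the element count of 32 in the usage note is an illustrative value that does not come from this PR.

```cpp
#include <cassert>
#include <vector>

// Assumption: mirrors the NVGPU dialect constant of the same name
// (the number of rows covered by one inner accumulator struct).
constexpr int kWgmmaSizeM = 64;

// Hypothetical helper: returns the row offset at which each inner struct
// would be stored when the offset advances by `step` after every store.
std::vector<int> rowOffsets(int numInnerStructs, int step) {
  assert(numInnerStructs >= 0 && "number of inner structs must be non-negative");
  std::vector<int> offsets;
  int offset = 0;
  for (int i = 0; i < numInnerStructs; ++i) {
    offsets.push_back(offset); // row at which inner struct i starts
    offset += step;            // advance to the next inner struct's rows
  }
  return offsets;
}
```

With two inner structs, `rowOffsets(2, kWgmmaSizeM)` gives rows 0 and 64, which is the post-fix behavior. Stepping by a per-struct element count such as 32 gives rows 0 and 32 instead, so the second store would land on rows already written by the first.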

---
Full diff: https://github.com/llvm/llvm-project/pull/154581.diff


1 Files Affected:

- (modified) mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp (+1-2) 


``````````diff
diff --git a/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp b/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
index c6c5ab356f256..fffcb2aedafee 100644
--- a/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
+++ b/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
@@ -1623,11 +1623,10 @@ struct NVGPUWarpgroupMmaStoreOpLowering
     Value matriDValue = adaptor.getMatrixD();
     auto stype = cast<LLVM::LLVMStructType>(matriDValue.getType());
     for (auto [idx, matrixD] : llvm::enumerate(stype.getBody())) {
-      auto structType = cast<LLVM::LLVMStructType>(matrixD);
       Value innerStructValue =
           LLVM::ExtractValueOp::create(b, matriDValue, idx);
       storeFragmentedMatrix(b, innerStructValue, op.getDstMemref(), offset);
-      offset += structType.getBody().size();
+      offset += kWgmmaSizeM;
     }
     rewriter.eraseOp(op);
     return success();

``````````

</details>


https://github.com/llvm/llvm-project/pull/154581

