[Mlir-commits] [mlir] [mlir][nvgpu] Fix the TMA stride setup (PR #75838)
Adam Paszke
llvmlistbot at llvm.org
Mon Dec 18 10:12:00 PST 2023
https://github.com/apaszke created https://github.com/llvm/llvm-project/pull/75838
There were two issues with the previous computation:
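* it always multiplied by `globalDim[1]`, so dimensions past the second one were never used
* each stride is built from the previous one, which is already in bytes, so multiplying by the element size again gave every further dimension an extra power of `elementSize`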
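To make the two issues concrete, here is a minimal standalone sketch that contrasts the previous formula with the fixed one. It is not the code from `CudaRuntimeWrappers.cpp`: the function names `fixedStrides` and `previousStrides` are hypothetical, and the sketch assumes a densely packed tensor whose dimensions are listed innermost first, with one byte stride per dimension after the first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, for illustration only.
// Byte strides for a densely packed tensor, innermost dimension first.
// strides[r] is the byte distance between consecutive indices along
// dimension r + 1, so a rank-N tensor has N - 1 entries.
std::vector<uint64_t> fixedStrides(const std::vector<uint64_t> &dims,
                                   uint64_t elemBytes) {
  assert(!dims.empty() && "rank must be at least 1");
  std::vector<uint64_t> strides(dims.size() - 1);
  if (strides.empty())
    return strides;
  strides[0] = dims[0] * elemBytes;
  // strides[r - 1] is already in bytes, so only the extent of
  // dimension r is multiplied in.
  for (std::size_t r = 1; r < strides.size(); ++r)
    strides[r] = strides[r - 1] * dims[r];
  return strides;
}

// Hypothetical helper reproducing the previous formula for comparison:
// it always uses dims[1] and multiplies by the element size again.
std::vector<uint64_t> previousStrides(const std::vector<uint64_t> &dims,
                                      uint64_t elemBytes) {
  assert(!dims.empty() && "rank must be at least 1");
  std::vector<uint64_t> strides(dims.size() - 1);
  if (strides.empty())
    return strides;
  strides[0] = dims[0] * elemBytes;
  for (std::size_t r = 1; r < strides.size(); ++r)
    strides[r] = strides[r - 1] * dims[1] * elemBytes;
  return strides;
}
```

For a rank-2 tensor both functions agree, because only `strides[0]` exists. With dimensions `{64, 32, 8, 4}` and 2-byte elements, the fixed version yields `{128, 4096, 32768}`, while the previous formula yields `{128, 8192, 524288}`.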
From 0e3bcb16d4df06fe6603635c749187d1d4c032c8 Mon Sep 17 00:00:00 2001
From: Adam Paszke <apaszke at google.com>
Date: Mon, 18 Dec 2023 18:07:55 +0000
Subject: [PATCH] [mlir][nvgpu] Fix the TMA stride setup
There were two issues with the previous computation:
* it never looked at dimensions past the second one
* the definition was recursive, making each dimension have an extra
`elementSize` power
---
mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp b/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
index 5ec87d58cc57f8..c45320a674568a 100644
--- a/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
+++ b/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
@@ -487,8 +487,7 @@ extern "C" MLIR_CUDA_WRAPPERS_EXPORT void *mgpuTensorMapEncodeTiledMemref(
   globalStrides[0] = globalDim[0] * elementSizeInBytes[tensorDataType];
   for (int r = 1; r < tensorRank - 1; r++)
-    globalStrides[r] = globalStrides[r - 1] * globalDim[1] *
-                       elementSizeInBytes[tensorDataType];
+    globalStrides[r] = globalStrides[r - 1] * globalDim[r];
   ScopedContext scopedContext;
   mgpuTensorMapEncodeTiled(&tensorMap, tensorDataType, tensorRank32,