[all-commits] [llvm/llvm-project] 2b983a: [MLIR][AMDGPU] Adding dynamic size check to avoid ...
Zhuoran Yin via All-commits
all-commits at lists.llvm.org
Tue Apr 15 13:36:46 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 2b983a24583dd4e131d727717872a56712b5dd52
https://github.com/llvm/llvm-project/commit/2b983a24583dd4e131d727717872a56712b5dd52
Author: Zhuoran Yin <zhuoryin at amd.com>
Date: 2025-04-15 (Tue, 15 Apr 2025)
Changed paths:
M mlir/include/mlir/Dialect/AMDGPU/Transforms/Passes.td
M mlir/lib/Dialect/AMDGPU/Transforms/CMakeLists.txt
M mlir/lib/Dialect/AMDGPU/Transforms/TransferReadToLoad.cpp
M mlir/test/Dialect/AMDGPU/transfer-read-to-load.mlir
M utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
Log Message:
-----------
[MLIR][AMDGPU] Adding dynamic size check to avoid subword buffer load (#135014)
Motivation: the AMDGPU buffer load instruction returns all zeros when
loading sub-word values. For example, assume the buffer size is exactly
one word and we invoke `llvm.amdgcn.raw.ptr.buffer.load.v2i32` starting
from byte 2 of the word. We will not receive the actual value of the
buffer but all zeros for the first word, because the load has crossed
the buffer boundary within the first word.
This PR fixes the problem by creating a bounds check for the buffer
load instruction. It compares offset + vector size against the buffer
size to see whether the upper bound of the address will exceed the
buffer size. If it does, the masked transfer read will be optimized to
`vector.load` + `arith.select`; otherwise, it will continue to fall
back to the default lowering of the masked vector load.
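
The comparison described above can be sketched in plain C++ roughly as
follows. This is only an illustration, not code from
TransferReadToLoad.cpp: the helper name `loadExceedsBuffer` and its
byte-based parameters are hypothetical, and the actual pass emits this
check as IR at runtime rather than evaluating it on the host.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the commit): returns true when a load of
// `vectorBytes` bytes starting at `offsetBytes` would read past the end of a
// buffer that is `bufferBytes` bytes long, i.e. offset + vector size exceeds
// the buffer size.
constexpr bool loadExceedsBuffer(std::uint64_t offsetBytes,
                                 std::uint64_t vectorBytes,
                                 std::uint64_t bufferBytes) {
  // An offset already past the end of the buffer is out of bounds.
  if (offsetBytes > bufferBytes)
    return true;
  // Compare against the remaining bytes instead of computing
  // offsetBytes + vectorBytes, which could overflow.
  return vectorBytes > bufferBytes - offsetBytes;
}

// The example from the log message: a one-word (4-byte) buffer and a v2i32
// (8-byte) load starting at byte 2 crosses the buffer boundary.
static_assert(loadExceedsBuffer(2, 8, 4), "load crosses the buffer end");
```

The result of this comparison is what the log message uses to choose
between the two lowerings.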