[llvm] [mlir] [MLIR][AMDGPU] Adding dynamic size check to avoid subword buffer load (PR #135014)

Wed Apr 9 12:24:19 PDT 2025

krzysz00 wrote:

To summarize discussions elsewhere:

1. The condition that forces us down the slow path is, with `delta` being `bufferSize - linearizedOffset`, that `delta < numElements && delta % (32 ceilDiv elementBitwidth) != 0`. So, trivially, for 32-bit and larger quantities we never go down the slow path, because the modulus is then 1.
2. We need a warning on these patterns that they're not currently known to work in the presence of negative offsets: the bounds check does a saturating addition, not a wrapping one, so code could end up in the fast path incorrectly.
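
As a rough illustration of the condition in point 1, here is a minimal Python sketch. It is not the code from the PR: the function names `ceil_div` and `needs_slow_path` are made up for this example, and it assumes `buffer_size`, `linearized_offset`, and `num_elements` are all counted in elements and that the offset is non-negative (the negative-offset case from point 2 is deliberately not modelled).

```python
def ceil_div(a: int, b: int) -> int:
    """Ceiling division for positive integers."""
    return -(-a // b)


def needs_slow_path(buffer_size: int, linearized_offset: int,
                    num_elements: int, element_bitwidth: int) -> bool:
    """Hypothetical helper mirroring the condition quoted in point 1.

    Assumption: all sizes and offsets are in elements, and the offset
    is non-negative.
    """
    delta = buffer_size - linearized_offset
    # Number of elements that fit in one 32-bit word; this is 1 for
    # element types of 32 bits or more.
    elements_per_word = ceil_div(32, element_bitwidth)
    return delta < num_elements and delta % elements_per_word != 0


# 8-bit elements, 3 elements left in the buffer, loading 4: slow path.
print(needs_slow_path(10, 7, 4, 8))
# 16-bit elements, 2 elements left (a whole word), loading 4: fast path.
print(needs_slow_path(10, 8, 4, 16))
# 32-bit elements: the modulus is 1, so the slow path is never taken.
print(needs_slow_path(10, 9, 4, 32))
```

With 32-bit or wider elements `elements_per_word` is 1, so `delta % 1` is always 0 and the second clause can never hold, which matches the "trivially" remark above.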

https://github.com/llvm/llvm-project/pull/135014