[llvm] [AMDGPU][GFX12] Restrict scalar subword loads to PAL (PR #117576)
Juan Manuel Martinez Caamaño via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 25 09:08:20 PST 2024
jmmartinez wrote:
> > But in this case, it is guaranteed that the buffers stride/num-records are aligned to 4.
>
> How is this guaranteed? This also doesn't seem like a property that should be baked into the platform
Thanks for asking this; I have never had to deal with an issue like this one.
In the ticket, the proposed workaround consists of 3 parts:
A) The driver always rounds buffer sizes to a multiple of 4 bytes (this applies to Vulkan)
B) The buffer is a strided (structured) buffer whose stride is known to be a multiple of 4 bytes (this can apply to DX12 structured buffers)
C) Fall back to using buffer_load_[iu]{8,16}
So I assume that the Vulkan and DX drivers would conform to this.
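
To make the three parts concrete, here is a rough sketch of the decision they describe. It is only an illustration, not the code in the PR: `BufferInfo`, `SubwordLoadKind`, `canUseScalarSubwordLoad` and `selectSubwordLoad` are hypothetical names, and it assumes the compiler is somehow told whether (A) or (B) holds for a given buffer.

```cpp
#include <cstdint>

// Hypothetical summary of what is known about a buffer at compile time.
struct BufferInfo {
  bool SizeRoundedTo4;  // (A) driver rounds the buffer size to a multiple of 4 bytes
  bool IsStructured;    // (B) buffer is a strided (structured) buffer
  uint32_t StrideBytes; // stride in bytes; 0 means "not known"
};

// Hypothetical result: keep the scalar subword load, or use part (C).
enum class SubwordLoadKind { Scalar, VectorBufferLoad };

// True when either (A) or (B) from the workaround holds.
constexpr bool canUseScalarSubwordLoad(const BufferInfo &B) {
  return B.SizeRoundedTo4 ||
         (B.IsStructured && B.StrideBytes != 0 && B.StrideBytes % 4 == 0);
}

// (C) Otherwise fall back to buffer_load_[iu]{8,16}.
constexpr SubwordLoadKind selectSubwordLoad(const BufferInfo &B) {
  return canUseScalarSubwordLoad(B) ? SubwordLoadKind::Scalar
                                    : SubwordLoadKind::VectorBufferLoad;
}
```

With this shape, a buffer with an unknown or non-multiple-of-4 stride and no size-rounding guarantee always takes the fallback, which is the conservative choice.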
I'm not happy with using the platform for this, but I wasn't sure if adding another option / another function attribute was the way to go.
https://github.com/llvm/llvm-project/pull/117576