[PATCH] D81638: AMDGPU/GlobalISel: Fix 96 and 128 local loads and stores

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 16 10:27:11 PDT 2020


arsenm added a comment.

In D81638#2096030 <https://reviews.llvm.org/D81638#2096030>, @mbrkusanin wrote:

> Looking at ISA .pdf docs for SI (gfx6) and onward I have not found any requirements for alignments on local loads and stores. There are mentions of dword alignment for reads and writes of dword and larger for buffer instructions but nothing more specific for LDS or GDS. SDag likes to break down ds_read/write_b128 in certain cases but does not know about b96. It seems to me that the code was not updated since SI.
>  Now b96 and b128 will be picked for align 4 and larger (align 2 and 1 are broken down same way as before). Furthermore, there are several Vulkan conformance tests that have align 4 loads and stores (96 and 128) that will now pass.


They definitely have some alignment requirements, but it's poorly documented. I'm pretty confident b128 requires 16-byte alignment, and b64 requires 8. I think gfx9 added unaligned access support (dependent on some config registers), but I think we never fully handled all of these changes. I think the linux driver hardcodes this to allow unaligned access.
The code was updated since SI, but probably not since 96-bit types were added to MVT.

> Alternative solution would be to break them down to ds_read/write_b64/b32 when we're not sure. But I don't have a test to check if this is necessary.
> 
> Since all changes are in .td files this should not cause you any problems with rewriting load/store legality rules.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81638/new/

https://reviews.llvm.org/D81638





More information about the llvm-commits mailing list