[PATCH] D82788: AMDGPU: Fix alignment requirements for 96bit and 128bit local loads and stores
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 29 15:32:33 PDT 2020
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1375
+ return AlignedBy4;
+ } else if (Size == 96) {
+ // ds_read/write_b96 require 16-byte alignment. gfx9 and onward has
----------------
No else after return
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1377
+ // ds_read/write_b96 require 16-byte alignment. gfx9 and onward has
+ // unaligned access support but windows likes to keep it dword aligned.
+ bool IsGFX9Plus = Subtarget->getGeneration() >= AMDGPUSubtarget::GFX9;
----------------
This is a windows driver bug and doesn't deserve mentioning here. We do not know what the host OS is
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1378-1379
+ // unaligned access support but windows likes to keep it dword aligned.
+ bool IsGFX9Plus = Subtarget->getGeneration() >= AMDGPUSubtarget::GFX9;
+ bool Aligned = Align % (IsGFX9Plus ? 4 : 16) == 0;
+ if (IsFast)
----------------
This should use an explicit subtarget feature check, not for the generation. The subtarget feature also theoretically captures this being a setting. We have hasUnalignedBufferAccess already (I believe the same control is used for DS and regular memory instructions, but I'm not sure)
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1384
+ return Aligned;
+ } else if (Size == 128) {
+ // ds_read/write_b128 require 16-byte alignment, but we can do a 8 byte
----------------
No else after return
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D82788/new/
https://reviews.llvm.org/D82788
More information about the llvm-commits
mailing list