[PATCH] D82788: AMDGPU: Fix alignment requirements for 96bit and 128bit local loads and stores

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 29 15:32:33 PDT 2020


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1375
+      return AlignedBy4;
+    } else if (Size == 96) {
+      // ds_read/write_b96 require 16-byte alignment. gfx9 and onward has
----------------
No else after return


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1377
+      // ds_read/write_b96 require 16-byte alignment. gfx9 and onward has
+      // unaligned access support but windows likes to keep it dword aligned.
+      bool IsGFX9Plus = Subtarget->getGeneration() >= AMDGPUSubtarget::GFX9;
----------------
This is a windows driver bug and doesn't deserve mentioning here. We do not know what the host OS is


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1378-1379
+      // unaligned access support but windows likes to keep it dword aligned.
+      bool IsGFX9Plus = Subtarget->getGeneration() >= AMDGPUSubtarget::GFX9;
+      bool Aligned = Align % (IsGFX9Plus ? 4 : 16) == 0;
+      if (IsFast)
----------------
This should use an explicit subtarget feature check, not for the generation. The subtarget feature also theoretically captures this being a setting. We have hasUnalignedBufferAccess already (I believe the same control is used for DS and regular memory instructions, but I'm not sure)


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1384
+      return Aligned;
+    } else if (Size == 128) {
+      // ds_read/write_b128 require 16-byte alignment, but we can do a 8 byte
----------------
No else after return


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82788/new/

https://reviews.llvm.org/D82788





More information about the llvm-commits mailing list