[PATCH] D53160: AMDGPU: Avoid selecting ds_{read,write}2_b32 on SI

Tue Oct 16 11:41:52 PDT 2018

nhaehnle added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:6711
+    if (Subtarget->getGeneration() == AMDGPUSubtarget::SOUTHERN_ISLANDS &&
+        NumElements == 2 && VT.getStoreSize() == 8 &&
+        Store->getAlignment() < 8) {
----------------
nhaehnle wrote:
> arsenm wrote:
> > NumElements == 2 is redundant and possibly wrong?
> I don't know. We shouldn't have unaligned i64 loads at this point, I guess, but the check does ensure that we're really dealing with a vector load. And NumElements > 2 is dealt with above.
What's more, even if we did have an unaligned i64 load/store at this point, it wouldn't really make sense to try to fix it. The SI bug only affects the case where the 8 bytes straddle the lower bound of LDS (i.e., vaddr == -4), and trying to load an i64 from there is wrong anyway.
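
To make the argument concrete, here is a rough standalone sketch of the two conditions being discussed. It is not the patch itself: `Generation`, `needsSplitOnSI` and `straddlesLowerBound` are hypothetical names made up for illustration, and the address check assumes that "straddling the lower bound" means the 8-byte range starts below 0 and ends above 0.

```cpp
#include <cstdint>

// Hypothetical stand-in for the subtarget generation; not the LLVM enum.
enum class Generation { SouthernIslands, Other };

// Mirrors the shape of the condition in the hunk above: on SI, a
// two-element, 8-byte vector access whose alignment is below 8 bytes.
bool needsSplitOnSI(Generation Gen, unsigned NumElements,
                    unsigned StoreSizeBytes, unsigned AlignBytes) {
  return Gen == Generation::SouthernIslands && NumElements == 2 &&
         StoreSizeBytes == 8 && AlignBytes < 8;
}

// Assumed interpretation of "straddles the lower bound of LDS": the 8-byte
// range [VAddr, VAddr + 8) begins below address 0 and ends above it.
// With 4-byte alignment the only such address is VAddr == -4.
bool straddlesLowerBound(std::int64_t VAddr) {
  return VAddr < 0 && VAddr + 8 > 0;
}
```

A two-dword vector access at vaddr == -4 still has one element at address 0, which is why the vector case is worth handling, whereas a single i64 at that address has no meaningful in-bounds interpretation.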

Repository:
  rL LLVM

https://reviews.llvm.org/D53160