[PATCH] D91336: AMDGPU/GlobalISel: Fix negative offset folding for buffer_load

Thu Nov 12 06:32:58 PST 2020

foad added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUGlobalISelUtils.cpp:16

-std::pair<Register, unsigned>
+std::pair<Register, int64_t>
 AMDGPU::getBaseWithConstantOffset(MachineRegisterInfo &MRI, Register Reg) {
----------------
I am really not sure about this change. I think `Reg` is always a 32-bit value here (perhaps we should assert this), so any G_ADD that we decompose will be a 32-bit add, so using `unsigned` (or perhaps `int`) for the offset makes sense.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:1377
+  MachineInstr *Add = getOpcodeDef(AMDGPU::G_ADD, CombinedOffset, *MRI);
+  if (Add && Offset >= 0) {
     Register Src0 = getSrcRegIgnoringCopies(*MRI, Add->getOperand(1).getReg());
----------------
I think the hardware zero-extends each component of the address from 32 to 64 bits before adding them together. So the question here is: does zext(add(x, y)) == add(zext(x), zext(y))? The answer is "only if add(x, y) has no unsigned overflow".

So I don't think "Offset >= 0" is exactly the right condition here, but it does fix the common case of a small negative immediate offset, so I'm happy with this as a short term fix.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91336/new/

https://reviews.llvm.org/D91336