[PATCH] D120370: [AMDGPU] Fix combined MMO in load-store merge

Wed Feb 23 08:44:32 PST 2022

rampitec added a comment.

In D120370#3339919 <https://reviews.llvm.org/D120370#3339919>, @foad wrote:

> MIRParser is a bit strange, because it parses 'align 8' as 'basealign 8'. So your test case:
>
>   %1:vgpr_32 = GLOBAL_LOAD_DWORD %0, 4, 0, implicit $exec :: (load (s32) from `i32 addrspace(1)* undef` + 4, align 8, addrspace 1)
>   %2:vgpr_32 = GLOBAL_LOAD_DWORD %0, 0, 0, implicit $exec :: (load (s32) from `float addrspace(1)* undef`, align 4, addrspace 1)
>
> really means:
>
>   %1:vgpr_32 = GLOBAL_LOAD_DWORD %0, 4, 0, implicit $exec :: (load (s32) from `i32 addrspace(1)* undef` + 4, basealign 8, addrspace 1)
>   %2:vgpr_32 = GLOBAL_LOAD_DWORD %0, 0, 0, implicit $exec :: (load (s32) from `float addrspace(1)* undef`, basealign 4, addrspace 1)
>
> so the two MMOs have the same base pointer, but different information about how aligned it is, which probably would not happen in real codegen.
>
> If you really want to handle that correctly, I suggest taking the larger of the two basealign values, instead of taking the one with the smaller offset.
>
> Separately I think it would be less confusing if the test case said "basealign" explicitly, at least when specifying an alignment that is "more aligned" than the offset.
>
> Separately maybe we should fix the MIRParser?

The thing I have written in the test might be mannered, but this is exactly what I meant for the testing purposes: align is 8, basealign is 4, offset to it is 4, and we somehow managed to know the alignment of the value with offset to be 8. I know it will not happen in real life likely.

Regardless of the MIR parser I suppose we need to use a pointer info from the lead operation except for the size, because final operation will have the same base pointer as the lead.

================
Comment at: llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:665-666

-// This function assumes that \p A and \p B have are identical except for
+// This function assumes that \p A and \p B are identical except for
 // size and offset, and they reference adjacent memory.
 static MachineMemOperand *combineKnownAdjacentMMOs(MachineFunction &MF,
----------------
foad wrote:
> If this comment was true ("A and B are identical except for size and offset") then your patch would not be necessary.
In fact I think this comment is generally incorrect. I'd probably better remove it al all. This is the point of the patch, use the MMO of the leading operation, this is why I have changed one of the pointers in the test to float, to spot that.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120370/new/

https://reviews.llvm.org/D120370