[PATCH] D120370: [AMDGPU] Fix combined MMO in load-store merge
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 22 16:39:27 PST 2022
rampitec created this revision.
rampitec added a reviewer: foad.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
rampitec requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.
Loads and stores can be out of order in the SILoadStoreOptimizer.
When combining MachineMemOperands of two instructions operands are
sent in the IR order into the combineKnownAdjacentMMOs. At the
moment it picks the first operand and just replaces its offset and
size. This essentially loses alignment information and may generally
result in an incorrect base pointer to be used.
Use a base pointer in memory addresses order instead and only adjust
size.
https://reviews.llvm.org/D120370
Files:
llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
llvm/test/CodeGen/AMDGPU/merge-global-load-store.mir
Index: llvm/test/CodeGen/AMDGPU/merge-global-load-store.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/merge-global-load-store.mir
+++ llvm/test/CodeGen/AMDGPU/merge-global-load-store.mir
@@ -405,7 +405,7 @@
; GCN-LABEL: name: merge_global_load_dword_2_out_of_order
; GCN: [[DEF:%[0-9]+]]:vreg_64_align2 = IMPLICIT_DEF
- ; GCN-NEXT: [[GLOBAL_LOAD_DWORDX2_:%[0-9]+]]:vreg_64_align2 = GLOBAL_LOAD_DWORDX2 [[DEF]], 0, 0, implicit $exec :: (load (s64) from `i32 addrspace(1)* undef`, addrspace 1)
+ ; GCN-NEXT: [[GLOBAL_LOAD_DWORDX2_:%[0-9]+]]:vreg_64_align2 = GLOBAL_LOAD_DWORDX2 [[DEF]], 0, 0, implicit $exec :: (load (s64) from `float addrspace(1)* undef`, align 4, addrspace 1)
; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX2_]].sub1
; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY killed [[GLOBAL_LOAD_DWORDX2_]].sub0
; GCN-NEXT: S_NOP 0, implicit [[COPY]], implicit [[COPY1]]
@@ -422,7 +422,7 @@
; GCN-LABEL: name: merge_global_load_dword_3_out_of_order
; GCN: [[DEF:%[0-9]+]]:vreg_64_align2 = IMPLICIT_DEF
- ; GCN-NEXT: [[GLOBAL_LOAD_DWORDX3_:%[0-9]+]]:vreg_96_align2 = GLOBAL_LOAD_DWORDX3 [[DEF]], 0, 0, implicit $exec :: (load (s96) from `i32 addrspace(1)* undef`, align 4, addrspace 1)
+ ; GCN-NEXT: [[GLOBAL_LOAD_DWORDX3_:%[0-9]+]]:vreg_96_align2 = GLOBAL_LOAD_DWORDX3 [[DEF]], 0, 0, implicit $exec :: (load (s96) from `float addrspace(1)* undef`, align 16, addrspace 1)
; GCN-NEXT: [[COPY:%[0-9]+]]:vreg_64_align2 = COPY [[GLOBAL_LOAD_DWORDX3_]].sub0_sub1
; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY killed [[GLOBAL_LOAD_DWORDX3_]].sub2
; GCN-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
Index: llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
+++ llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
@@ -662,19 +662,15 @@
return true;
}
-// This function assumes that \p A and \p B have are identical except for
+// This function assumes that \p A and \p B are identical except for
// size and offset, and they reference adjacent memory.
static MachineMemOperand *combineKnownAdjacentMMOs(MachineFunction &MF,
const MachineMemOperand *A,
const MachineMemOperand *B) {
- unsigned MinOffset = std::min(A->getOffset(), B->getOffset());
unsigned Size = A->getSize() + B->getSize();
- // This function adds the offset parameter to the existing offset for A,
- // so we pass 0 here as the offset and then manually set it to the correct
- // value after the call.
- MachineMemOperand *MMO = MF.getMachineMemOperand(A, 0, Size);
- MMO->setOffset(MinOffset);
- return MMO;
+ if (A->getOffset() > B->getOffset())
+ A = B;
+ return MF.getMachineMemOperand(A, A->getPointerInfo(), Size);
}
bool SILoadStoreOptimizer::dmasksCanBeCombined(const CombineInfo &CI,
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D120370.410675.patch
Type: text/x-patch
Size: 3046 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220223/9dfd3171/attachment.bin>
More information about the llvm-commits
mailing list