[PATCH] D73880: AMDGPU/GlobalISel: Fix move s.buffer.load to VALU
Nicolai Hähnle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 5 08:40:34 PST 2020
nhaehnle added a comment.
Thank you for doing this. A couple of concerns...
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:1375-1377
+ // Use the alignment to ensure that the required offsets will fit into the
+ // immediate offsets.
+ const unsigned Align = NumLoads > 1 ? 16 * NumLoads : 4;
----------------
What's the justification for the alignment in the `NumLoads > 1` case?
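For reference, my mental model of what this alignment is buying is roughly the following: a wide load gets split into `NumLoads` 16-byte sub-loads at byte offsets `16*i` from the base. A small self-contained sketch (the helper name is hypothetical, not the actual LLVM code):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustration only: compute the byte offsets of the NumLoads 16-byte
// sub-loads that a wide s.buffer.load would be split into, given a base
// offset that honors the chosen alignment. Not the actual LLVM helper.
std::vector<uint32_t> splitLoadOffsets(uint32_t BaseOffset, unsigned NumLoads) {
  // Mirrors the patch: 16 * NumLoads when splitting, else natural 4-byte align.
  const unsigned Align = NumLoads > 1 ? 16 * NumLoads : 4;
  assert(BaseOffset % Align == 0 && "base must honor the chosen alignment");
  std::vector<uint32_t> Offsets;
  for (unsigned I = 0; I != NumLoads; ++I)
    Offsets.push_back(BaseOffset + 16 * I); // each sub-load covers 16 bytes
  return Offsets;
}
```

But that only requires the base to be 16-aligned for the sub-loads themselves; it's not obvious to me why the *total* size `16 * NumLoads` is the right alignment for the immediate-offset argument, hence the question.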
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:1437-1448
+ if (RSrcBank != &AMDGPU::SGPRRegBank) {
+ // Remove the original instruction to avoid potentially confusing the
+ // waterfall loop logic.
+ B.setInstr(*Span.begin());
+ MI.eraseFromParent();
+
+ SmallSet<Register, 4> OpsToWaterfall;
----------------
This case should waterfall the original load if only the srsrc is a VGPR...
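To spell out what I mean by waterfalling: readfirstlane picks the descriptor of the first still-active lane, every lane holding that same descriptor executes the (now uniform) load, and those lanes drop out of the loop. A scalar model of the idiom, purely illustrative (lane count, types, and the `waterfall` name are not GFX- or LLVM-accurate):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Scalar simulation of the waterfall loop over a non-uniform value.
// LaneRsrc[i] is lane i's (possibly divergent) descriptor value.
template <typename Fn>
void waterfall(const std::vector<uint64_t> &LaneRsrc, Fn ExecuteWith) {
  uint64_t Exec = (LaneRsrc.size() >= 64) ? ~0ull
                                          : (1ull << LaneRsrc.size()) - 1;
  while (Exec) {
    unsigned First = __builtin_ctzll(Exec); // v_readfirstlane: first active lane
    uint64_t Rsrc = LaneRsrc[First];        // value is now uniform (SGPR)
    uint64_t SameMask = 0;                  // lanes that share this descriptor
    for (unsigned L = 0; L < LaneRsrc.size(); ++L)
      if (((Exec >> L) & 1) && LaneRsrc[L] == Rsrc)
        SameMask |= 1ull << L;
    ExecuteWith(Rsrc, SameMask);            // loop body runs with exec = SameMask
    Exec &= ~SameMask;                      // retire those lanes, loop back
  }
}
```

The point being: when only the srsrc is divergent, the body run under `SameMask` can still be the original scalar load with the readfirstlaned descriptor, rather than a VALU buffer load.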
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn-s-buffer-load.mir:79
; CHECK: [[BUILD_VECTOR:%[0-9]+]]:sgpr(<4 x s32>) = G_BUILD_VECTOR [[V_READFIRSTLANE_B32_]](s32), [[V_READFIRSTLANE_B32_1]](s32), [[V_READFIRSTLANE_B32_2]](s32), [[V_READFIRSTLANE_B32_3]](s32)
- ; CHECK: [[INT:%[0-9]+]]:sgpr(<4 x s32>) = G_INTRINSIC intrinsic(@llvm.amdgcn.s.buffer.load), [[BUILD_VECTOR]](<4 x s32>), [[COPY1]](s32), 0
+ ; CHECK: [[AMDGPU_BUFFER_LOAD:%[0-9]+]]:vgpr(<4 x s32>) = G_AMDGPU_BUFFER_LOAD [[BUILD_VECTOR]](<4 x s32>), [[C1]](s32), [[COPY2]], [[C]], 0, 0, 0 :: (dereferenceable invariant load 16, align 4)
; CHECK: [[S_AND_SAVEEXEC_B64_:%[0-9]+]]:sreg_64_xexec = S_AND_SAVEEXEC_B64 killed [[S_AND_B64_]], implicit-def $exec, implicit-def $scc, implicit $exec
----------------
This should still be an s_buffer_load...
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73880/new/
https://reviews.llvm.org/D73880