[PATCH] D155343: MachineSink: Fix sinking VGPR def out of a divergent loop
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 18 08:31:27 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/machine-sink-loop-var-out-of-divergent-loop-swdev407790.mir:54
; CHECK-NEXT: SI_END_CF [[SI_IF_BREAK]], implicit-def dead $exec, implicit-def dead $scc, implicit $exec
- ; CHECK-NEXT: [[V_ADD_U32_e64_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; CHECK-NEXT: INLINEASM &"", 1 /* sideeffect attdialect */, implicit [[V_ADD_U32_e64_]]
----------------
ruiling wrote:
> Sorry, I don't see why we are not allowed to sink this kind of loop-invariant v_add out of the loop. For this specific case, the result vgpr should be the same with and without the change, right?
Maybe we could sink it if we introduced a new block before the reconvergence at si_end_cf.
This testcase has 2 loops, nested with unrelated conditions. Sinking here moves the instruction from the body of the first loop into the second loop, so it now executes for lanes which would not have taken the inner loop.
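To make the lane argument concrete, here is a toy model in Python, not the real MachineSink or AMDGPU semantics. It assumes only that a VALU write updates the lanes that are active in exec at the point where the instruction sits. The helper valu_add, the two exec masks, and the lane values are all hypothetical and are not taken from the testcase.

```python
# Toy model: a 4-lane "wave". A VALU op writes only lanes whose exec bit is set;
# inactive lanes keep whatever the destination register held before.

def valu_add(dst, a, b, exec_mask):
    """Return the new destination register after a masked per-lane add."""
    return [x + y if active else old
            for old, x, y, active in zip(dst, a, b, exec_mask)]

# Loop-invariant operands: identical no matter where the add is placed.
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
undef = [None] * 4  # destination lanes before the add has run

# Hypothetical exec masks at the two candidate positions. The two loops have
# unrelated conditions, so neither mask is derived from the other.
exec_at_original_position = [True, False, True, False]
exec_at_sunk_position = [True, True, False, False]

original = valu_add(undef, a, b, exec_at_original_position)
sunk = valu_add(undef, a, b, exec_at_sunk_position)

print("original:", original)
print("sunk:    ", sunk)
```

Even though the operands are loop invariant, the set of lanes that receive a defined result changes with the position of the add, so a later user can read a different register value in some lanes.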
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D155343/new/
https://reviews.llvm.org/D155343