[PATCH] D155343: MachineSink: Fix sinking VGPR def out of a divergent loop

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 18 08:31:27 PDT 2023

arsenm added inline comments.

Comment at: llvm/test/CodeGen/AMDGPU/machine-sink-loop-var-out-of-divergent-loop-swdev407790.mir:54
   ; CHECK-NEXT:   SI_END_CF [[SI_IF_BREAK]], implicit-def dead $exec, implicit-def dead $scc, implicit $exec
-  ; CHECK-NEXT:   [[V_ADD_U32_e64_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
   ; CHECK-NEXT:   INLINEASM &"", 1 /* sideeffect attdialect */, implicit [[V_ADD_U32_e64_]]
ruiling wrote:
> Sorry, I don't see why we are not allowed to sink this kind of loop-invariant v_add out of the loop. For this specific case, the resulting VGPR should be the same with and without the change, right?
Maybe we could sink it if we introduced a new block before the reconvergence at si_end_cf.

This testcase has two nested loops with unrelated conditions. Sinking here moves the instruction from the body of the first loop into the second loop. It now executes for lanes which would not have taken the inner loop.
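
To illustrate the point about lanes, here is a toy Python model of a 4-lane wave. It is only a sketch: the `v_add` helper, the operand values, and both exec masks are made up for illustration and are not taken from the testcase or from MachineSink. It assumes only that a VALU instruction writes the lanes enabled in the exec mask at the point where it executes.

```python
def v_add(dst, a, b, exec_mask):
    """Toy VALU add: only lanes enabled in exec_mask are written;
    disabled lanes keep whatever dst already held."""
    return [x + y if on else old
            for old, x, y, on in zip(dst, a, b, exec_mask)]

# Hypothetical loop-invariant operands, one value per lane.
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
undef = [None] * 4  # destination VGPR before the add

# Hypothetical exec masks: the two loops have unrelated conditions,
# so neither set of active lanes is a subset of the other.
first_loop_exec = [True, True, True, False]
second_loop_exec = [False, True, False, True]

# Original placement: the add runs in the body of the first loop.
at_def = v_add(undef, a, b, first_loop_exec)

# After sinking: the same add runs under the second loop's mask.
sunk = v_add(undef, a, b, second_loop_exec)

print("original:", at_def)
print("sunk:    ", sunk)
```

Under these assumed masks the operands are identical in both placements, yet the set of lanes written differs, which is why loop invariance of the operands alone does not make the sink safe.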


