[PATCH] D155343: MachineSink: Fix sinking VGPR def out of a divergent loop

Ruiling, Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 18 09:00:05 PDT 2023


ruiling added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/machine-sink-loop-var-out-of-divergent-loop-swdev407790.mir:54
   ; CHECK-NEXT:   SI_END_CF [[SI_IF_BREAK]], implicit-def dead $exec, implicit-def dead $scc, implicit $exec
-  ; CHECK-NEXT:   [[V_ADD_U32_e64_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
   ; CHECK-NEXT:   INLINEASM &"", 1 /* sideeffect attdialect */, implicit [[V_ADD_U32_e64_]]
----------------
arsenm wrote:
> ruiling wrote:
> > Sorry, I don't see why we are not allowed to sink this kind of loop-invariant v_add out of the loop. For this specific case, the resulting VGPR should be the same with and without the change, right?
> Maybe we could sink it if we introduced a new block before the reconvergence at si_end_cf.
> 
> This testcase has 2 loops, nested with unrelated conditions. By sinking here, it's sinking from the body of the first loop into the second loop. It now executes for lanes which would not have taken the inner loop.
I am not sure whether I have read the CFG correctly. bb.4 here is inside the inner loop, and bb.5 is the reconvergence point for the inner loop. So all lanes that are active after the SI_END_CF in bb.5 (which restores exec to its value before entering the inner loop) should also have been active in the inner loop (bb.4).
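
To make the lane argument above concrete, here is a rough Python sketch of how exec evolves across a single divergent loop. It is only a model under assumptions: the function name, the per-lane trip counts, and the bitmask encoding are all made up for illustration, and the three mask updates approximate my understanding of SI_IF_BREAK / SI_LOOP / SI_END_CF (OR into a break mask, ANDN2 on exec, OR back into exec). It models one loop only, so it says nothing about which of the two loops in the test bb.4 actually belongs to.

```python
def simulate_divergent_loop(exec_on_entry, trip_counts):
    """Model exec for one divergent loop (illustrative, not real lowering).

    exec_on_entry: bitmask of lanes active when the loop is entered.
    trip_counts:   hypothetical per-lane iteration counts (each >= 1).
    Returns (exec_after_end_cf, lanes_seen_in_body).
    """
    exec_mask = exec_on_entry
    break_mask = 0          # accumulated mask of lanes that have left the loop
    lanes_seen_in_body = 0  # every lane that executed the loop body at least once
    iteration = 0

    while exec_mask:
        iteration += 1
        lanes_seen_in_body |= exec_mask

        # Lanes whose (assumed) break condition is true in this iteration.
        cond = 0
        for lane, trips in enumerate(trip_counts):
            if ((exec_mask >> lane) & 1) and iteration >= trips:
                cond |= 1 << lane

        break_mask |= cond        # roughly SI_IF_BREAK: accumulate breaking lanes
        exec_mask &= ~break_mask  # roughly SI_LOOP: drop them, loop while exec != 0

    exec_mask |= break_mask       # roughly SI_END_CF: re-enable the lanes that left
    return exec_mask, lanes_seen_in_body


# Lanes 0, 1 and 3 enter the loop; lane 2 was already inactive.
after, seen = simulate_divergent_loop(0b1011, [1, 3, 2, 2])
print(bin(after), bin(seen))
```

In this model the lanes active after the reconvergence point are exactly the lanes that entered the loop, and each of them ran the body at least once. Whether that matches the CFG in this test depends on bb.4 and bb.5 really being the body and reconvergence block of the same loop.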


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155343/new/

https://reviews.llvm.org/D155343


