[llvm] [AMDGPU] Add another SIFoldOperands instance after shrink (PR #67878)

Mon Oct 2 09:50:53 PDT 2023

================
@@ -312,13 +312,13 @@ define <2 x i32> @v_srem_v2i32_pow2k_denom(<2 x i32> %num) {
 ; GISEL-NEXT:    v_sub_i32_e32 v0, vcc, v0, v4
 ; GISEL-NEXT:    v_sub_i32_e32 v1, vcc, v1, v3
 ; GISEL-NEXT:    v_subrev_i32_e32 v3, vcc, s4, v0
-; GISEL-NEXT:    v_subrev_i32_e32 v4, vcc, s4, v1
+; GISEL-NEXT:    v_subrev_i32_e32 v4, vcc, 0x1000, v1
----------------
rampitec wrote:

This seems to be a deficiency in the shrink pass itself. Before the shrink pass there were four subtractions using this register; two were already V_SUBREV_CO_U32_e32 and two were V_SUB_CO_U32_e64:

```
  %47:sreg_32 = S_MOV_B32 4096
  %62:vgpr_32, dead %95:sreg_64 = V_SUB_CO_U32_e64 %59:vgpr_32, %47:sreg_32, 0, implicit $exec
  %65:vgpr_32, dead %94:sreg_64 = V_SUB_CO_U32_e64 %63:vgpr_32, %47:sreg_32, 0, implicit $exec
  %37:vgpr_32 = V_SUBREV_CO_U32_e32 %47:sreg_32, %34:vgpr_32, implicit-def $vcc, implicit $exec
  %40:vgpr_32 = V_SUBREV_CO_U32_e32 %47:sreg_32, %38:vgpr_32, implicit-def $vcc, implicit $exec
```
The shrink pass did not touch the last two, which are already VOP2, but commuted the first two:

```
  %62:vgpr_32, dead %95:sreg_64 = V_SUBREV_CO_U32_e64 %47:sreg_32, %59:vgpr_32, 0, implicit $exec
  %65:vgpr_32, dead %94:sreg_64 = V_SUBREV_CO_U32_e64 %47:sreg_32, %63:vgpr_32, 0, implicit $exec
```
After the commute, however, it did not go on to shrink the commuted instructions. They were only shrunk much later, when the shrink pass ran again after RA.

I believe this is because of this code in the shrink pass:

```
      if (!TII->canShrink(MI, *MRI)) {
        // Try commuting the instruction and see if that enables us to shrink
        // it.
        if (!MI.isCommutable() || !TII->commuteInstruction(MI) ||
            !TII->canShrink(MI, *MRI)) {
          tryReplaceDeadSDST(MI);
          continue;
        }
      }
```
The continue here moves on to the next instruction without trying to shrink the newly commuted one.
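
The control flow of the quoted loop body can be sketched in a small standalone C++ model. This is a toy, not LLVM code: `ToyInst`, `toyCanShrink`, `toyCommute` and `toyShrinkPass` are hypothetical names, and the rule that shrinking needs the carry-out to already be in VCC is an assumption chosen only so that the commuted form still fails the second check before RA. It shows that the commute is a side effect that persists when the loop bails out through `continue`.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-in for a machine instruction; this is not LLVM's MachineInstr.
struct ToyInst {
  bool Reversed = false;        // false: SUB a, b; true: SUBREV b, a
  bool Src0IsVGPR = true;       // starts as SUB vgpr, sgpr
  bool Src1IsVGPR = false;
  bool CarryOutInVCC = false;   // assumed false before register allocation
  bool Is32BitEncoding = false; // set once the instruction has been shrunk
};

// Hypothetical predicate: the short encoding needs a VGPR in src1 and,
// in this toy model only, a carry-out that already lives in VCC.
bool toyCanShrink(const ToyInst &I) {
  return I.Src1IsVGPR && I.CarryOutInVCC;
}

// Swaps the sources in place and flips SUB <-> SUBREV. Always succeeds here.
bool toyCommute(ToyInst &I) {
  std::swap(I.Src0IsVGPR, I.Src1IsVGPR);
  I.Reversed = !I.Reversed;
  return true;
}

// Same shape as the quoted loop body, without tryReplaceDeadSDST.
void toyShrinkPass(std::vector<ToyInst> &Insts) {
  for (ToyInst &I : Insts) {
    if (!toyCanShrink(I)) {
      // The commute mutates I before the second check runs, so the
      // instruction stays commuted even when we leave via continue.
      if (!toyCommute(I) || !toyCanShrink(I))
        continue;
    }
    I.Is32BitEncoding = true;
  }
}
```

Running `toyShrinkPass` on a single default `ToyInst` leaves it commuted but still in the long encoding. Setting `CarryOutInVCC` to true, as a stand-in for the state after RA, and running the pass again shrinks it.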

https://github.com/llvm/llvm-project/pull/67878