[llvm] [AMDGPU] Add another SIFoldOperands instance after shrink (PR #67878)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 2 04:07:10 PDT 2023


================
@@ -312,13 +312,13 @@ define <2 x i32> @v_srem_v2i32_pow2k_denom(<2 x i32> %num) {
 ; GISEL-NEXT:    v_sub_i32_e32 v0, vcc, v0, v4
 ; GISEL-NEXT:    v_sub_i32_e32 v1, vcc, v1, v3
 ; GISEL-NEXT:    v_subrev_i32_e32 v3, vcc, s4, v0
-; GISEL-NEXT:    v_subrev_i32_e32 v4, vcc, s4, v1
+; GISEL-NEXT:    v_subrev_i32_e32 v4, vcc, 0x1000, v1
----------------
jayfoad wrote:

I think the root cause is that some shrinking only happens post-RA (e.g. shrinking V_SUBREV_CO_U32_e64 only if VCC was allocated for the carry-out operand), which is too late to do any more folding.

And the reason it only happens for _some_ SUBREV instructions is even more convoluted. It's because SIFoldOperands will sometimes shrink V_SUB_CO_U32_**e64** to V_SUBREV_CO_U32_**e32** even if it does not manage to fold anything into it. This does seem wrong and is probably worth a closer look.
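
To illustrate the ordering problem, here is a toy Python model. It is only a sketch and not LLVM code: the `Sub` record, the `fold`, `shrink`, `allocate` and `pipeline` helpers, and the `%carry` register name are all hypothetical. It also assumes that a literal such as 0x1000 can only be folded into the e32 encoding, which is my reading of the diff rather than something stated above.

```python
from dataclasses import dataclass, replace

VCC = "vcc"

@dataclass(frozen=True)
class Sub:
    encoding: str   # "e64" or "e32"
    carry_out: str  # virtual register before allocation, physical after
    src0: str       # source register name, or a literal once folded

def fold(inst, constants):
    # Toy rule (assumption): a literal can only be folded into the e32 form.
    if inst.encoding == "e32" and inst.src0 in constants:
        return replace(inst, src0=constants[inst.src0])
    return inst

def shrink(inst):
    # Shrink e64 to e32 only when the carry-out is known to be VCC.
    if inst.encoding == "e64" and inst.carry_out == VCC:
        return replace(inst, encoding="e32")
    return inst

def allocate(inst, assignment):
    # Map a virtual carry-out register to its assigned physical register.
    return replace(inst, carry_out=assignment.get(inst.carry_out, inst.carry_out))

def pipeline(inst, constants, assignment):
    inst = fold(inst, constants)       # first fold
    inst = shrink(inst)                # pre-RA shrink
    inst = fold(inst, constants)       # extra fold after shrink
    inst = allocate(inst, assignment)  # register allocation
    return shrink(inst)                # post-RA shrink, nothing folds after this

constants = {"s4": "0x1000"}
assignment = {"%carry": VCC}

# Only shrinkable once VCC has been allocated: ends up e32 but keeps s4.
late = pipeline(Sub("e64", "%carry", "s4"), constants, assignment)
# Already e32 before allocation: the literal gets folded.
early = pipeline(Sub("e32", VCC, "s4"), constants, assignment)
print(late)
print(early)
```

In this model both instructions end up in the e32 form, but only the one that was already e32 before register allocation picks up the literal, which mirrors the mixed `s4` / `0x1000` result in the hunk above.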



https://github.com/llvm/llvm-project/pull/67878

