[llvm] [AMDGPU] Add another SIFoldOperands instance after shrink (PR #67878)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 2 04:07:10 PDT 2023
================
@@ -312,13 +312,13 @@ define <2 x i32> @v_srem_v2i32_pow2k_denom(<2 x i32> %num) {
; GISEL-NEXT: v_sub_i32_e32 v0, vcc, v0, v4
; GISEL-NEXT: v_sub_i32_e32 v1, vcc, v1, v3
; GISEL-NEXT: v_subrev_i32_e32 v3, vcc, s4, v0
-; GISEL-NEXT: v_subrev_i32_e32 v4, vcc, s4, v1
+; GISEL-NEXT: v_subrev_i32_e32 v4, vcc, 0x1000, v1
----------------
jayfoad wrote:
I think the root cause is that some shrinking only happens post-RA (e.g. V_SUBREV_CO_U32_e64 is shrunk only if VCC was allocated for the carry-out operand), which is too late to do any more folding.
And the reason it only happens for _some_ SUBREV instructions is even more convoluted. It's because SIFoldOperands will sometimes shrink V_SUB_CO_U32_**e64** to V_SUBREV_CO_U32_**e32** even if it does not manage to fold anything into it. This does seem wrong and is probably worth a closer look.
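To make the ordering argument concrete, here is a rough toy model in Python. It is not LLVM code: `Inst`, `fold`, `shrink`, `allocate` and `pipeline` are hypothetical names, and the rule that only the e32 form can take the literal is an assumption of the sketch. The inputs represent the state after the first SIFoldOperands run, with `v4` already in e32 form (the quirk described above) and `v3` still e64.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class Inst:
    dst: str
    encoding: str           # "e64" or "e32"
    carry_out: str          # virtual register before RA, "vcc" after
    src0: Union[str, int]   # register name, or a folded literal


# s4 holds the constant 0x1000, as in the test diff.
CONSTS = {"s4": 0x1000}


def fold(inst: Inst) -> Inst:
    """Toy fold: replace a constant-holding register with its literal.
    Assumption of this sketch: only the e32 form can take the literal."""
    if inst.encoding == "e32" and inst.src0 in CONSTS:
        return replace(inst, src0=CONSTS[inst.src0])
    return inst


def shrink(inst: Inst) -> Inst:
    """Toy shrink: e64 -> e32 only once the carry-out is VCC."""
    if inst.encoding == "e64" and inst.carry_out == "vcc":
        return replace(inst, encoding="e32")
    return inst


def allocate(inst: Inst) -> Inst:
    """Toy register allocation: the carry-out happens to land in VCC."""
    return replace(inst, carry_out="vcc")


def pipeline(inst: Inst, extra_fold: bool) -> Inst:
    inst = shrink(inst)        # pre-RA shrink
    if extra_fold:
        inst = fold(inst)      # the additional fold instance
    inst = allocate(inst)
    return shrink(inst)        # post-RA shrink; no fold runs after it


v3 = Inst("v3", "e64", "%carry", "s4")  # still e64 when the folds run
v4 = Inst("v4", "e32", "vcc", "s4")     # already e32, nothing folded yet

for inst in (v3, v4):
    out = pipeline(inst, extra_fold=True)
    print(out.dst, out.encoding, out.src0)
```

In this model both instructions end up as e32, but only `v4` picks up the literal: `v3` becomes e32 only in the final post-RA shrink, after the last fold has already run.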
https://github.com/llvm/llvm-project/pull/67878