[llvm] [ARM] Stop gluing ALU nodes to branches / selects (PR #116970)

Sergei Barannikov via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 20 07:31:13 PST 2024


================
@@ -865,45 +869,49 @@ define arm_aapcs_vfpcc <8 x half> @fptosi_v8i1_v8f16(<8 x half> %src) {
 ; CHECK-MVE-NEXT:    vcvt.s32.f16 s4, s0
 ; CHECK-MVE-NEXT:    vmovx.f16 s0, s0
 ; CHECK-MVE-NEXT:    vcvt.s32.f16 s0, s0
+; CHECK-MVE-NEXT:    vmov r0, s4
+; CHECK-MVE-NEXT:    vmov r1, s0
 ; CHECK-MVE-NEXT:    vldr.16 s8, .LCPI25_0
-; CHECK-MVE-NEXT:    vmov r0, s0
 ; CHECK-MVE-NEXT:    vmov.f16 s6, #1.000000e+00
-; CHECK-MVE-NEXT:    lsls r0, r0, #31
-; CHECK-MVE-NEXT:    vmov r0, s4
-; CHECK-MVE-NEXT:    vseleq.f16 s10, s8, s6
-; CHECK-MVE-NEXT:    vcvt.s32.f16 s4, s1
-; CHECK-MVE-NEXT:    lsls r0, r0, #31
-; CHECK-MVE-NEXT:    vseleq.f16 s0, s8, s6
-; CHECK-MVE-NEXT:    vins.f16 s0, s10
 ; CHECK-MVE-NEXT:    vmovx.f16 s10, s1
 ; CHECK-MVE-NEXT:    vcvt.s32.f16 s10, s10
-; CHECK-MVE-NEXT:    vmov r0, s10
 ; CHECK-MVE-NEXT:    lsls r0, r0, #31
+; CHECK-MVE-NEXT:    lsls r1, r1, #31
+; CHECK-MVE-NEXT:    vseleq.f16 s4, s8, s6
+; CHECK-MVE-NEXT:    cmp r0, #0
+; CHECK-MVE-NEXT:    vseleq.f16 s0, s8, s6
+; CHECK-MVE-NEXT:    vmov r1, s10
+; CHECK-MVE-NEXT:    vins.f16 s0, s4
+; CHECK-MVE-NEXT:    vcvt.s32.f16 s4, s1
 ; CHECK-MVE-NEXT:    vmov r0, s4
-; CHECK-MVE-NEXT:    vcvt.s32.f16 s4, s2
-; CHECK-MVE-NEXT:    vmovx.f16 s2, s2
-; CHECK-MVE-NEXT:    vseleq.f16 s10, s8, s6
-; CHECK-MVE-NEXT:    vcvt.s32.f16 s2, s2
+; CHECK-MVE-NEXT:    vmovx.f16 s10, s3
+; CHECK-MVE-NEXT:    vcvt.s32.f16 s10, s10
+; CHECK-MVE-NEXT:    lsls r1, r1, #31
+; CHECK-MVE-NEXT:    vseleq.f16 s4, s8, s6
 ; CHECK-MVE-NEXT:    lsls r0, r0, #31
-; CHECK-MVE-NEXT:    vmov r0, s2
+; CHECK-MVE-NEXT:    cmp r0, #0
 ; CHECK-MVE-NEXT:    vseleq.f16 s1, s8, s6
-; CHECK-MVE-NEXT:    vins.f16 s1, s10
-; CHECK-MVE-NEXT:    lsls r0, r0, #31
+; CHECK-MVE-NEXT:    vins.f16 s1, s4
+; CHECK-MVE-NEXT:    vcvt.s32.f16 s4, s2
+; CHECK-MVE-NEXT:    vmovx.f16 s2, s2
 ; CHECK-MVE-NEXT:    vmov r0, s4
-; CHECK-MVE-NEXT:    vseleq.f16 s10, s8, s6
-; CHECK-MVE-NEXT:    vcvt.s32.f16 s4, s3
+; CHECK-MVE-NEXT:    vcvt.s32.f16 s2, s2
+; CHECK-MVE-NEXT:    vmov r1, s2
 ; CHECK-MVE-NEXT:    lsls r0, r0, #31
+; CHECK-MVE-NEXT:    lsls r1, r1, #31
+; CHECK-MVE-NEXT:    vseleq.f16 s4, s8, s6
+; CHECK-MVE-NEXT:    cmp r0, #0
----------------
s-barannikov wrote:

The extra `cmp r0, #0` instructions appear because the nodes are now scheduled a little differently, in a way that `ARMBaseInstrInfo::optimizeCompareInstr` is unable to handle. I think this can be fixed, but that should be done separately. Given the overall improvement this PR provides, I suppose it's okay to do this later?
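
To illustrate the kind of pattern involved, here is a toy model in Python. It is not LLVM's implementation: `Instr` and `cmp_zero_is_foldable` are hypothetical names, and the rule it encodes (a compare against zero can be dropped only if the instruction defining the register already sets the flags and nothing in between reads or writes them) is a simplified assumption about what such a peephole needs. The `interleaved` sequence is taken from the hunk above; the `adjacent` sequence is made up for contrast.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instr:
    """Hypothetical, heavily simplified instruction record."""
    text: str
    dst: Optional[str] = None   # register written, if any
    sets_flags: bool = False    # writes the condition flags
    reads_flags: bool = False   # consumes the condition flags


def cmp_zero_is_foldable(block: List[Instr], cmp_index: int, reg: str) -> bool:
    """Return True if the `cmp reg, #0` at cmp_index looks redundant.

    Walk backwards from the compare to the instruction that defines `reg`.
    The compare is foldable (in this toy model) only if that definition
    already sets the flags and no instruction in between touches them.
    """
    for instr in reversed(block[:cmp_index]):
        if instr.dst == reg:
            return instr.sets_flags
        if instr.sets_flags or instr.reads_flags:
            return False
    return False


# Made-up sequence: the flag-setting shift is directly followed by the compare.
adjacent = [
    Instr("lsls r0, r0, #31", dst="r0", sets_flags=True),
    Instr("cmp r0, #0", sets_flags=True),
    Instr("vseleq.f16 s0, s8, s6", dst="s0", reads_flags=True),
]

# Sequence from the new CHECK lines: a second shift and a select sit
# between the definition of r0 and the compare.
interleaved = [
    Instr("lsls r0, r0, #31", dst="r0", sets_flags=True),
    Instr("lsls r1, r1, #31", dst="r1", sets_flags=True),
    Instr("vseleq.f16 s4, s8, s6", dst="s4", reads_flags=True),
    Instr("cmp r0, #0", sets_flags=True),
    Instr("vseleq.f16 s0, s8, s6", dst="s0", reads_flags=True),
]

if __name__ == "__main__":
    print(cmp_zero_is_foldable(adjacent, 1, "r0"))
    print(cmp_zero_is_foldable(interleaved, 3, "r0"))
```

In this model the compare in `adjacent` is foldable, while the one in `interleaved` is not, because the flags produced by the shift of `r0` are overwritten and consumed before the compare is reached.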


https://github.com/llvm/llvm-project/pull/116970

