[llvm] [AArch64][SVE] Fold integer lane extract and store to FPR store (PR #129756)

Mon Mar 17 04:41:00 PDT 2025

================
@@ -269,23 +269,27 @@ define <4 x i256> @load_sext_v4i32i256(ptr %ap) {
 ; CHECK-NEXT:    ext z0.b, z0.b, z0.b, #8
 ; CHECK-NEXT:    sunpklo z0.d, z0.s
 ; CHECK-NEXT:    fmov x9, d1
-; CHECK-NEXT:    mov z1.d, z1.d[1]
-; CHECK-NEXT:    fmov x11, d0
-; CHECK-NEXT:    mov z0.d, z0.d[1]
-; CHECK-NEXT:    asr x10, x9, #63
-; CHECK-NEXT:    stp x9, x10, [x8]
+; CHECK-NEXT:    mov z2.d, z1.d[1]
----------------
MacDue wrote:

I've added a heuristic: if there are other users of integer scalars from this vector that won't be folded into a store, don't fold. This resolves this case and the Neon regressions, so the fold is now enabled for both SVE and Neon.

Note: `hasOneUse()` alone was not enough to prevent regressions, since this fold can extend vector lifetimes and disrupt paired stores if applied generally.
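
To illustrate the shape of the heuristic, here is a minimal standalone sketch. It is a toy model, not the code in the patch: `UserKind`, `LaneExtract`, `foldsIntoStore`, and `shouldFoldExtractStores` are hypothetical names invented for illustration, and it assumes an extract only folds when its single user is a store.

```cpp
#include <vector>

// Hypothetical stand-ins for DAG users; these names are not from the patch.
enum class UserKind { StoreOfExtract, OtherScalarUse };

// One integer lane extract from a vector, with the users of the extracted scalar.
struct LaneExtract {
  unsigned Lane;
  std::vector<UserKind> Users;
};

// Assumption: an extract can be folded into an FPR store only when the store
// is its sole user (roughly what a hasOneUse()-style check would test).
bool foldsIntoStore(const LaneExtract &E) {
  return E.Users.size() == 1 && E.Users.front() == UserKind::StoreOfExtract;
}

// Vector-wide check: fold only if every integer scalar taken from the vector
// would itself be folded into a store. A single extract that still needs a
// GPR copy (e.g. it feeds an `asr`) blocks the fold for the whole vector.
bool shouldFoldExtractStores(const std::vector<LaneExtract> &ExtractsOfVector) {
  for (const LaneExtract &E : ExtractsOfVector)
    if (!foldsIntoStore(E))
      return false;
  return true;
}
```

In this model, a vector whose lane 0 feeds only a store but whose lane 1 has another scalar use is rejected, even though the lane 0 extract on its own passes the single-use check.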

https://github.com/llvm/llvm-project/pull/129756