[llvm] [AArch64][SVE] Fold integer lane extract and store to FPR store (PR #129756)
Benjamin Maxwell via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 17 04:41:00 PDT 2025
================
@@ -269,23 +269,27 @@ define <4 x i256> @load_sext_v4i32i256(ptr %ap) {
; CHECK-NEXT: ext z0.b, z0.b, z0.b, #8
; CHECK-NEXT: sunpklo z0.d, z0.s
; CHECK-NEXT: fmov x9, d1
-; CHECK-NEXT: mov z1.d, z1.d[1]
-; CHECK-NEXT: fmov x11, d0
-; CHECK-NEXT: mov z0.d, z0.d[1]
-; CHECK-NEXT: asr x10, x9, #63
-; CHECK-NEXT: stp x9, x10, [x8]
+; CHECK-NEXT: mov z2.d, z1.d[1]
----------------
MacDue wrote:
I've added a heuristic: "If there are other users of integer scalars from this vector that won't be folded into a store, don't fold". This resolves this case and the Neon regressions, so the fold is now enabled for both SVE and Neon.
Note: `hasOneUse()` alone was not enough to prevent regressions, since this fold can extend vector lifetimes and disrupt paired stores if applied generally.
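
As a rough illustration of the heuristic (not the code in the PR), here is a toy model in plain C++. `UseKind`, `LaneExtract`, and `shouldFoldExtractsIntoStores` are hypothetical names made up for this sketch and do not correspond to LLVM types or APIs; the real check operates on SelectionDAG nodes.

```cpp
#include <vector>

// Hypothetical stand-ins for DAG uses; these are NOT LLVM types.
enum class UseKind { Store, Other };

// One integer scalar extracted from a lane of the source vector,
// together with the kinds of nodes that consume that scalar.
struct LaneExtract {
  unsigned Lane;
  std::vector<UseKind> Users;
};

// Assumed rule: an extract can be folded only if every user is a store.
static bool allUsersAreStores(const LaneExtract &E) {
  if (E.Users.empty())
    return false;
  for (UseKind K : E.Users)
    if (K != UseKind::Store)
      return false;
  return true;
}

// Fold only when no integer scalar taken from this vector has a user
// that would still need the value in a general-purpose register.
bool shouldFoldExtractsIntoStores(const std::vector<LaneExtract> &Extracts) {
  if (Extracts.empty())
    return false;
  for (const LaneExtract &E : Extracts)
    if (!allUsersAreStores(E))
      return false;
  return true;
}
```

In this model, a vector whose lane 0 is both stored and consumed by another instruction (such as the `asr` in the test above) is rejected as a whole, even if its other lanes feed only stores. That is the difference from checking `hasOneUse()` on each extract in isolation.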
https://github.com/llvm/llvm-project/pull/129756