[llvm] [AArch64] Prevent unnecessary truncation in bool vector reduce code generation (PR #120096)
Eli Friedman via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 17 15:52:26 PST 2024
================
@@ -174,12 +174,12 @@ define i64 @extract_last_i64(<2 x i64> %data, <2 x i64> %mask, i64 %passthru) {
; SVE-FIXED-NEXT: sub sp, sp, #16
; SVE-FIXED-NEXT: .cfi_def_cfa_offset 16
; SVE-FIXED-NEXT: cmtst v1.2d, v1.2d, v1.2d
-; SVE-FIXED-NEXT: index z2.s, #0, #1
+; SVE-FIXED-NEXT: index z3.s, #0, #1
; SVE-FIXED-NEXT: mov x9, sp
; SVE-FIXED-NEXT: str q0, [sp]
-; SVE-FIXED-NEXT: xtn v1.2s, v1.2d
-; SVE-FIXED-NEXT: and v2.8b, v1.8b, v2.8b
-; SVE-FIXED-NEXT: umaxp v1.2s, v1.2s, v1.2s
+; SVE-FIXED-NEXT: xtn v2.2s, v1.2d
+; SVE-FIXED-NEXT: umaxv s1, v1.4s
----------------
efriedma-quic wrote:
Oh, I see, we got lucky with instruction reuse before... and if we don't get the reuse, S-form umaxv is faster than xtn+umaxp? In that case, I guess this change is okay.
https://github.com/llvm/llvm-project/pull/120096
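
To illustrate why the two sequences in the hunk compute the same result, here is a minimal Python model of the lane arithmetic. It is only a sketch: the helper names are hypothetical and are not LLVM or NEON APIs, and it says nothing about instruction latency. It assumes the mask comes out of cmtst, so every 64-bit lane is either all-ones or all-zeros.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
MASK32 = 0xFFFFFFFF

def cmtst_2d(a, b):
    # Per 64-bit lane: all-ones if (a & b) is nonzero, otherwise zero.
    return [MASK64 if (x & y) != 0 else 0 for x, y in zip(a, b)]

def xtn_2s(v):
    # Narrow each 64-bit lane to its low 32 bits.
    return [x & MASK32 for x in v]

def umaxp_2s_lane0(v):
    # Lane 0 of a pairwise unsigned max of a 2 x 32-bit vector with itself.
    return max(v[0], v[1])

def as_4s(v):
    # Reinterpret 2 x 64-bit lanes as 4 x 32-bit lanes (little-endian order).
    out = []
    for x in v:
        out.append(x & MASK32)
        out.append((x >> 32) & MASK32)
    return out

def umaxv_4s(v):
    # Unsigned max across all four 32-bit lanes.
    return max(as_4s(v))

def reduce_old(mask):
    # Old sequence: cmtst, then xtn, then umaxp.
    return umaxp_2s_lane0(xtn_2s(cmtst_2d(mask, mask)))

def reduce_new(mask):
    # New sequence: cmtst, then umaxv on the untruncated register.
    return umaxv_4s(cmtst_2d(mask, mask))

if __name__ == "__main__":
    for mask in ([0, 0], [0, 1], [1 << 40, 0], [7, 9]):
        print(mask, hex(reduce_old(mask)), hex(reduce_new(mask)))
```

In this model the equivalence depends on the cmtst step. Without it, a lane such as 1 << 40 would narrow to zero under xtn but still be nonzero under a 4 x 32-bit max.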