[llvm] [AArch64][CostModel] Lower cost of dupq (SVE2.1) (PR #144918)
Gaëtan Bossu via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 24 01:19:21 PDT 2025
================
@@ -71,13 +72,43 @@ define void @dupq_f16_256b(ptr %addr) #0 {
}
define void @dupq_bf16_256b(ptr %addr) #0 {
-; CHECK-LABEL: dupq_bf16_256b:
-; CHECK: // %bb.0:
-; CHECK-NEXT: ldp q0, q1, [x0]
-; CHECK-NEXT: dup v0.8h, v0.h[2]
-; CHECK-NEXT: dup v1.8h, v1.h[2]
-; CHECK-NEXT: stp q0, q1, [x0]
-; CHECK-NEXT: ret
+; SVE-LABEL: dupq_bf16_256b:
+; SVE: // %bb.0:
+; SVE-NEXT: ldp q0, q1, [x0]
+; SVE-NEXT: dup v0.8h, v0.h[2]
+; SVE-NEXT: dup v1.8h, v1.h[2]
+; SVE-NEXT: stp q0, q1, [x0]
+; SVE-NEXT: ret
+;
+; SME-LABEL: dupq_bf16_256b:
+; SME: // %bb.0:
+; SME-NEXT: ldp q1, q0, [x0]
+; SME-NEXT: str q0, [sp, #-64]!
+; SME-NEXT: .cfi_def_cfa_offset 64
+; SME-NEXT: ldr h0, [sp, #4]
+; SME-NEXT: str q1, [sp, #32]
+; SME-NEXT: str h0, [sp, #30]
+; SME-NEXT: str h0, [sp, #28]
+; SME-NEXT: str h0, [sp, #26]
+; SME-NEXT: str h0, [sp, #24]
+; SME-NEXT: str h0, [sp, #22]
+; SME-NEXT: str h0, [sp, #20]
+; SME-NEXT: str h0, [sp, #18]
+; SME-NEXT: str h0, [sp, #16]
+; SME-NEXT: ldr h0, [sp, #36]
+; SME-NEXT: ldr q1, [sp, #16]
+; SME-NEXT: str h0, [sp, #62]
+; SME-NEXT: str h0, [sp, #60]
+; SME-NEXT: str h0, [sp, #58]
+; SME-NEXT: str h0, [sp, #56]
+; SME-NEXT: str h0, [sp, #54]
+; SME-NEXT: str h0, [sp, #52]
+; SME-NEXT: str h0, [sp, #50]
+; SME-NEXT: str h0, [sp, #48]
+; SME-NEXT: ldr q0, [sp, #48]
+; SME-NEXT: stp q0, q1, [x0]
+; SME-NEXT: add sp, sp, #64
+; SME-NEXT: ret
----------------
gbossu wrote:
Checking: the use of ld/st in the SME output is because, when the function is in streaming mode, SVE instructions aren't available.
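
For reference, both the SVE and the SME check lines above compute the same result: lane 2 of each 128-bit segment is broadcast across that segment. The SME sequence does it through stack stores and reloads rather than `dup`. A minimal Python model of that computation, assuming a 256-bit vector of sixteen 16-bit elements (the function name `dupq_lane` is hypothetical and is not part of LLVM):

```python
def dupq_lane(elems, lanes_per_segment, lane):
    """Broadcast element `lane` of each segment across that segment.

    Hypothetical helper for illustration only; it models the result of
    the checked sequences, not how LLVM lowers the shuffle.
    """
    if len(elems) % lanes_per_segment != 0:
        raise ValueError("vector length must be a multiple of the segment size")
    if not 0 <= lane < lanes_per_segment:
        raise ValueError("lane index out of range for the segment")
    out = []
    for start in range(0, len(elems), lanes_per_segment):
        # Every lane of this segment receives the selected source lane.
        out.extend([elems[start + lane]] * lanes_per_segment)
    return out


# 256 bits of 16-bit elements: two 128-bit segments of 8 lanes each.
vec = list(range(16))
result = dupq_lane(vec, lanes_per_segment=8, lane=2)
```

Here `result` holds eight copies of element 2 followed by eight copies of element 10, matching the `ldr h0, [sp, #4]` and `ldr h0, [sp, #36]` loads (byte offset 4 is lane 2 of a segment of 2-byte elements).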
https://github.com/llvm/llvm-project/pull/144918