[llvm] [CodeGen] Implement widening for partial.reduce.add (PR #161834)

Sander de Smalen via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 8 08:03:48 PDT 2025


================
@@ -0,0 +1,98 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc < %s | FileCheck %s
+
+target triple = "aarch64"
+
+define void @partial_reduce_widen_v1i32_acc_v16i32_vec(ptr %accptr, ptr %resptr, ptr %vecptr) {
+; CHECK-LABEL: partial_reduce_widen_v1i32_acc_v16i32_vec:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ldp q1, q0, [x2]
+; CHECK-NEXT:    ldr s2, [x0]
+; CHECK-NEXT:    ldp q5, q6, [x2, #32]
+; CHECK-NEXT:    ext v3.16b, v1.16b, v1.16b, #8
+; CHECK-NEXT:    ext v4.16b, v0.16b, v0.16b, #8
+; CHECK-NEXT:    add v1.2s, v2.2s, v1.2s
+; CHECK-NEXT:    ext v2.16b, v5.16b, v5.16b, #8
+; CHECK-NEXT:    add v0.2s, v1.2s, v0.2s
+; CHECK-NEXT:    add v1.2s, v4.2s, v3.2s
+; CHECK-NEXT:    ext v3.16b, v6.16b, v6.16b, #8
+; CHECK-NEXT:    add v0.2s, v0.2s, v5.2s
+; CHECK-NEXT:    add v1.2s, v2.2s, v1.2s
+; CHECK-NEXT:    add v0.2s, v0.2s, v6.2s
+; CHECK-NEXT:    add v1.2s, v3.2s, v1.2s
+; CHECK-NEXT:    add v0.2s, v1.2s, v0.2s
+; CHECK-NEXT:    dup v1.2s, v0.s[1]
+; CHECK-NEXT:    add v0.2s, v0.2s, v1.2s
+; CHECK-NEXT:    str s0, [x1]
+; CHECK-NEXT:    ret
+  %acc = load <1 x i32>, ptr %accptr
+  %vec = load <16 x i32>, ptr %vecptr
----------------
sdesmalen-arm wrote:

I didn't do that so that the test function wouldn't have to take or return illegal types (the ABI only describes how legal types are passed).
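
For illustration, here is a minimal sketch of the two test shapes being contrasted. It is not taken from the patch: the function names `@by_value` and `@via_memory` are hypothetical, the body of the real test is cut off in the hunk above, and the intrinsic spelling and type suffixes are an assumption based on the PR title (older trees may use an `experimental` prefix).

```llvm
; Hypothetical by-value variant (not in the patch). Both <1 x i32> and
; <16 x i32> would cross the call boundary as an argument or return value,
; so the generated code would also depend on how those types are passed.
define <1 x i32> @by_value(<1 x i32> %acc, <16 x i32> %vec) {
  ; Assumed intrinsic name and mangling; check LangRef for your LLVM version.
  %res = call <1 x i32> @llvm.vector.partial.reduce.add.v1i32.v16i32(<1 x i32> %acc, <16 x i32> %vec)
  ret <1 x i32> %res
}

; Pointer-based shape, matching the signature used in the test: only ptr
; arguments cross the call boundary, and the vectors are loaded and stored
; inside the function.
define void @via_memory(ptr %accptr, ptr %resptr, ptr %vecptr) {
  %acc = load <1 x i32>, ptr %accptr
  %vec = load <16 x i32>, ptr %vecptr
  %res = call <1 x i32> @llvm.vector.partial.reduce.add.v1i32.v16i32(<1 x i32> %acc, <16 x i32> %vec)
  store <1 x i32> %res, ptr %resptr
  ret void
}
```

With the second shape, the CHECK lines only cover the loads, the reduction itself, and the store, which appears to be the intent of the test.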

https://github.com/llvm/llvm-project/pull/161834

