[llvm] [ARM64EC] Fix thunks for vector args (PR #96003)
Daniel Paoliello via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 20 10:37:22 PDT 2024
================
@@ -487,6 +487,109 @@ define void @cxx_method(ptr noundef nonnull align 8 dereferenceable(8) %0, ptr d
ret void
}
+define <4 x i8> @small_vector(<4 x i8> %0) {
+; CHECK-LABEL: .def $ientry_thunk$cdecl$m$m;
+; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m$m
+; CHECK: // %bb.0:
+; CHECK-NEXT: sub sp, sp, #192
+; CHECK-NEXT: .seh_stackalloc 192
+; CHECK-NEXT: stp q6, q7, [sp, #16] // 32-byte Folded Spill
+; CHECK-NEXT: .seh_save_any_reg_p q6, 16
+; CHECK-NEXT: stp q8, q9, [sp, #48] // 32-byte Folded Spill
+; CHECK-NEXT: .seh_save_any_reg_p q8, 48
+; CHECK-NEXT: stp q10, q11, [sp, #80] // 32-byte Folded Spill
+; CHECK-NEXT: .seh_save_any_reg_p q10, 80
+; CHECK-NEXT: stp q12, q13, [sp, #112] // 32-byte Folded Spill
+; CHECK-NEXT: .seh_save_any_reg_p q12, 112
+; CHECK-NEXT: stp q14, q15, [sp, #144] // 32-byte Folded Spill
+; CHECK-NEXT: .seh_save_any_reg_p q14, 144
+; CHECK-NEXT: stp x29, x30, [sp, #176] // 16-byte Folded Spill
+; CHECK-NEXT: .seh_save_fplr 176
+; CHECK-NEXT: add x29, sp, #176
+; CHECK-NEXT: .seh_add_fp 176
+; CHECK-NEXT: .seh_endprologue
+; CHECK-NEXT: str w0, [sp, #12]
+; CHECK-NEXT: ldr s0, [sp, #12]
+; CHECK-NEXT: ushll v0.8h, v0.8b, #0
----------------
dpaoliello wrote:
I believe so: it looks like LLVM stores `<4 x i8>` as a tightly packed `i32` in `xmm0`, so this code is widening that arg before calling the ARM64 code and then packing it again afterwards.
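
As a rough illustration of the widen-then-repack step described above, here is a small Python model of what the quoted CHECK lines appear to do: the packed 32-bit value is moved from `w0` into `s0` through the stack, and `ushll v0.8h, v0.8b, #0` zero-extends each byte into a 16-bit lane. The helper names are hypothetical and not part of the patch, and the little-endian lane order and the 16-bit lane width of the ARM64 side are assumptions read off the assembly rather than something confirmed in this thread.

```python
# Hypothetical model of the thunk's widen/narrow steps; not code from the PR.

def widen_packed_bytes(packed: int) -> list[int]:
    """Split a packed 32-bit value into four bytes, each zero-extended.

    Mirrors the effect of `ushll v0.8h, v0.8b, #0` on the low four lanes.
    Lane 0 is assumed to be the least significant byte.
    """
    packed &= 0xFFFFFFFF
    return [(packed >> (8 * lane)) & 0xFF for lane in range(4)]


def lanes_to_d_register(lanes: list[int]) -> int:
    """Build the 64-bit register image holding four 16-bit lanes."""
    image = 0
    for lane, value in enumerate(lanes):
        image |= (value & 0xFFFF) << (16 * lane)
    return image


def pack_lanes(lanes: list[int]) -> int:
    """Narrow four lanes back to bytes and repack them into 32 bits."""
    packed = 0
    for lane, value in enumerate(lanes):
        packed |= (value & 0xFF) << (8 * lane)
    return packed


if __name__ == "__main__":
    arg = 0x04030201
    lanes = widen_packed_bytes(arg)
    print(lanes, hex(lanes_to_d_register(lanes)), hex(pack_lanes(lanes)))
```

Under these assumptions, the round trip `pack_lanes(widen_packed_bytes(x))` returns `x` for any 32-bit `x`, which is the property the thunk relies on when it packs the result again after the call.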
https://github.com/llvm/llvm-project/pull/96003