[llvm] [AArch64] Combine store (trunc X to <3 x i8>) to sequence of ST1.b. (PR #78637)
Florian Hahn via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 19 08:19:43 PST 2024
================
@@ -154,17 +154,12 @@ define <3 x i32> @load_v3i32(ptr %src) {
define void @store_trunc_from_64bits(ptr %src, ptr %dst) {
; CHECK-LABEL: store_trunc_from_64bits:
; CHECK: ; %bb.0: ; %entry
-; CHECK-NEXT: sub sp, sp, #16
-; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: ldr s0, [x0]
-; CHECK-NEXT: ldrh w8, [x0, #4]
-; CHECK-NEXT: mov.h v0[2], w8
-; CHECK-NEXT: xtn.8b v0, v0
-; CHECK-NEXT: str s0, [sp, #12]
-; CHECK-NEXT: ldrh w9, [sp, #12]
-; CHECK-NEXT: strb w8, [x1, #2]
-; CHECK-NEXT: strh w9, [x1]
-; CHECK-NEXT: add sp, sp, #16
+; CHECK-NEXT: add x8, x0, #4
+; CHECK-NEXT: ld1r.4h { v0 }, [x8]
+; CHECK-NEXT: ldr w8, [x0]
----------------
fhahn wrote:
> This isn't really doing a good job of demonstrating what you're trying to do here... maybe add a testcase with some arithmetic, so the store and the load can't be combined together?
The store and load have different addresses, so it shouldn't be possible to combine them. Is there another combining opportunity I am missing?
> Also, this looks like it's getting miscompiled; it's only storing two bytes.
Yes, this was using incorrect extract indices; should be fixed now, thanks!
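For reference, a test along the lines the reviewer suggests could look like the sketch below. The function name and the added constants are hypothetical and are not taken from the patch. The `add` sits between the load and the truncating store, so the two memory operations cannot simply be folded into a narrower copy.

```llvm
; Hypothetical test sketch: load <3 x i16>, do some arithmetic,
; then truncate to <3 x i8> and store all three bytes to %dst.
define void @store_trunc_add_from_64bits(ptr %src, ptr %dst) {
entry:
  %l = load <3 x i16>, ptr %src, align 1
  ; Arithmetic on the loaded lanes (constants chosen arbitrarily).
  %add = add <3 x i16> %l, <i16 3, i16 4, i16 5>
  ; Narrow each lane to 8 bits; the store must write exactly 3 bytes.
  %t = trunc <3 x i16> %add to <3 x i8>
  store <3 x i8> %t, ptr %dst, align 1
  ret void
}
```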
https://github.com/llvm/llvm-project/pull/78637