[llvm] [SelectionDAG] Use unaligned store to move AVX registers onto stack for `extractelement` (PR #78422)

David Green via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 29 08:32:47 PST 2024


================
@@ -437,8 +442,10 @@ define <2 x i64> @extract_fixed_v2i64_nxv2i64(<vscale x 2 x i64> %vec) nounwind
 ; CHECK-NEXT:    str x29, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-NEXT:    addvl sp, sp, #-1
 ; CHECK-NEXT:    ptrue p0.d
-; CHECK-NEXT:    st1d { z0.d }, p0, [sp]
-; CHECK-NEXT:    ldr q0, [sp, #16]
+; CHECK-NEXT:    mov z2.d, z0.d
+; CHECK-NEXT:    ldr q1, [sp, #16]
+; CHECK-NEXT:    mov v0.16b, v1.16b
+; CHECK-NEXT:    st1d { z2.d }, p0, [sp]
----------------
davemgreen wrote:

Sorry - I didn't see the ping. Looking at this now, I don't think the generated code is correct: the `ldr` from the stack slot is now scheduled before the `st1d` that writes it. We could certainly be doing better, but when going via the stack like this the store needs to happen before the load.

It looks like the stack object gets created with a non-scalable size (only `Bytes.getKnownMinValue()`), so the MMO has size `s128` rather than `vscale * s128`. A 16-byte store at `[sp]` then appears disjoint from the 16-byte load at `[sp, #16]`, and the two are treated as not aliasing.

Everything in MachineFrameInfo looks like it still uses int64_t for the object sizes, not TypeSize or LocationSize. A short-term fix might be to check if VecVT is scalable and use an unknown size if so.
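
To illustrate the reasoning, here is a minimal standalone sketch of the overlap check. It is not LLVM code: `Access`, `mayAlias` and `storeSize` are hypothetical names made up for this example, and the real logic lives in the MMO / alias-analysis machinery. It only shows why a fixed 16-byte size makes the store at offset 0 look disjoint from the load at offset 16, and why an unknown size would force the conservative answer.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of one access to a stack slot. This is an assumption
// for illustration, not LLVM's MachineMemOperand or LocationSize.
struct Access {
  int64_t Offset;                // byte offset from the slot base
  std::optional<uint64_t> Size;  // std::nullopt means "unknown size"
};

// Conservative overlap test: if either size is unknown, assume the accesses
// may alias; otherwise compare the half-open byte ranges.
bool mayAlias(const Access &A, const Access &B) {
  if (!A.Size || !B.Size)
    return true;
  int64_t AEnd = A.Offset + static_cast<int64_t>(*A.Size);
  int64_t BEnd = B.Offset + static_cast<int64_t>(*B.Size);
  return A.Offset < BEnd && B.Offset < AEnd;
}

// Sketch of the suggested short-term fix: record an unknown size when the
// stored vector type is scalable, and the known minimum size otherwise.
std::optional<uint64_t> storeSize(bool IsScalable, uint64_t KnownMinBytes) {
  if (IsScalable)
    return std::nullopt;
  return KnownMinBytes;
}
```

With a store at offset 0 sized by `storeSize(false, 16)` and a 16-byte load at offset 16, `mayAlias` returns false, which matches the reordering seen in the test diff. With `storeSize(true, 16)` it returns true, so the load could not be moved above the store.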

https://github.com/llvm/llvm-project/pull/78422

