[llvm] [AArch64][SVE] Don't require 16-byte aligned SVE loads/stores with +strict-align (PR #119732)

Fri Dec 13 02:51:12 PST 2024

================
@@ -2569,6 +2569,17 @@ MVT AArch64TargetLowering::getScalarShiftAmountTy(const DataLayout &DL,
 bool AArch64TargetLowering::allowsMisalignedMemoryAccesses(
     EVT VT, unsigned AddrSpace, Align Alignment, MachineMemOperand::Flags Flags,
     unsigned *Fast) const {
+
+  // Allow SVE loads/stores where the alignment >= the size of the element type,
+  // even with +strict-align. The SVE loads/stores do not require memory to be
+  // aligned more than the element type even without unaligned accesses.
+  // Without this, already aligned loads and stores are forced to have 16-byte
+  // alignment, which is unnecessary and fails to build as
+  // TLI.expandUnalignedLoad() and TLI.expandUnalignedStore() don't yet support
+  // scalable vectors.
+  if (VT.isScalableVector() && Alignment >= Align(VT.getScalarSizeInBits() / 8))
----------------
sdesmalen-arm wrote:

`Align(VT.getScalarSizeInBits() / 8)` will fail an assert when the scalar type of VT is smaller than `MVT::i8` (e.g. a predicate with `MVT::i1` elements), because the division yields 0. This would fail even when the +strict-align feature is not set. Could you add a test for this case?
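
To illustrate the concern, here is a minimal standalone sketch of the arithmetic only. It does not use LLVM headers; the helper names (`isValidAlignValue`, `elementBytesTruncating`, `elementBytesClamped`) are hypothetical, and clamping to one byte is just one possible guard, not necessarily the fix this patch should adopt. It assumes `llvm::Align` rejects values that are zero or not a power of two.

```cpp
#include <cstdint>

// Hypothetical stand-in for the precondition llvm::Align is assumed to
// assert on: the value must be non-zero and a power of two.
constexpr bool isValidAlignValue(uint64_t V) {
  return V != 0 && (V & (V - 1)) == 0;
}

// The computation used in the patch: integer division truncates, so a
// 1-bit element type (e.g. an SVE predicate element) yields 0 bytes.
constexpr uint64_t elementBytesTruncating(uint64_t ScalarSizeInBits) {
  return ScalarSizeInBits / 8;
}

// One possible guard (an assumption, not the patch's code): treat
// sub-byte element types as needing at least one byte.
constexpr uint64_t elementBytesClamped(uint64_t ScalarSizeInBits) {
  return ScalarSizeInBits < 8 ? 1 : ScalarSizeInBits / 8;
}

// i1 elements: the truncating form produces 0, which is not a valid
// alignment value; the clamped form produces 1.
static_assert(!isValidAlignValue(elementBytesTruncating(1)),
              "i1 gives an invalid alignment value");
static_assert(isValidAlignValue(elementBytesClamped(1)),
              "clamped i1 gives a valid alignment value");

// i32 elements: both forms agree on 4 bytes.
static_assert(elementBytesTruncating(32) == 4, "i32 is 4 bytes");
static_assert(elementBytesClamped(32) == 4, "i32 is 4 bytes");
```

An alternative to clamping would be to skip the new early return entirely for predicate types, which keeps the existing behaviour for them.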

Not caused by your patch, but I am surprised to see LLVM actually generate regular loads/stores when the alignment is smaller than the element size, e.g. `load <4 x i32>, ptr %ptr, align 1` when the `+strict-align` flag is not set at all. Is this a bug?

https://github.com/llvm/llvm-project/pull/119732