[llvm] [AArch64][SVE] Don't require 16-byte aligned SVE loads/stores with +strict-align (PR #119732)

Fri Dec 13 02:51:13 PST 2024

================
@@ -2569,6 +2569,17 @@ MVT AArch64TargetLowering::getScalarShiftAmountTy(const DataLayout &DL,
 bool AArch64TargetLowering::allowsMisalignedMemoryAccesses(
     EVT VT, unsigned AddrSpace, Align Alignment, MachineMemOperand::Flags Flags,
     unsigned *Fast) const {
+
+  // Allow SVE loads/stores where the alignment >= the size of the element type,
+  // even with +strict-align. The SVE loads/stores do not require memory to be
+  // aligned more than the element type even without unaligned accesses.
+  // Without this, already aligned loads and stores are forced to have 16-byte
+  // alignment, which is unnecessary and fails to build as
+  // TLI.expandUnalignedLoad() and TLI.expandUnalignedStore() don't yet support
+  // scalable vectors.
----------------
sdesmalen-arm wrote:

Nit on the comment: this is true for SVE's ld1/st1 instructions, but not for SVE ldr/str, which require the address to be 16-byte aligned for data vectors (and 2-byte aligned for predicate vectors). So there is an assumption here that a store of `<vscale x 4 x i32>` ends up using `st1`, which is true in practice if the store comes from the IR.
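
To make the distinction concrete, here is a minimal standalone sketch of the rule as described in this comment. It is not LLVM code: `SveAccess`, `requiredAlignment`, and `isAccessAllowed` are hypothetical names used only for illustration, and the model assumes alignment checking is being enforced.

```cpp
#include <cstdint>

// Hypothetical model of the alignment rules described above; not LLVM code.
enum class SveAccess {
  Ld1St1,      // element-wise contiguous load/store (ld1/st1)
  LdrStrData,  // whole data-vector ldr/str
  LdrStrPred   // whole predicate-vector ldr/str
};

// Minimum address alignment in bytes for each kind of access, assuming
// alignment checking is enforced.
inline std::uint64_t requiredAlignment(SveAccess Kind,
                                       std::uint64_t ElementBytes) {
  switch (Kind) {
  case SveAccess::Ld1St1:
    return ElementBytes; // only element-size alignment is needed
  case SveAccess::LdrStrData:
    return 16;           // data vectors need 16-byte alignment
  case SveAccess::LdrStrPred:
    return 2;            // predicate vectors need 2-byte alignment
  }
  return 16; // not reached for valid enum values; conservative fallback
}

// True if an access with the given known alignment satisfies the rule.
inline bool isAccessAllowed(SveAccess Kind, std::uint64_t ElementBytes,
                            std::uint64_t AlignBytes) {
  return AlignBytes >= requiredAlignment(Kind, ElementBytes);
}
```

Under this model, a 4-byte-aligned store of `<vscale x 4 x i32>` is acceptable only if it is selected as `st1`; the same store selected as `str` would need 16-byte alignment.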

https://github.com/llvm/llvm-project/pull/119732