[PATCH] D133408: [AArch64] Use misaligned load/store to optimize memory access with non-power2 integer types.

chenglin.bi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 7 21:43:08 PDT 2022


bcl5980 added inline comments.


================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:800
+      if (TLI.allowMisalignedMemForNonPow2Type(SrcVT, LD->getAddressSpace(),
+                                               ExtraLoadAlign, MMOFlags)) {
+        IncSizeBits = ExtraWidth;
----------------
efriedma wrote:
> The alignment you're passing in here doesn't seem right; you want to pass in the alignment of the load you're planning to generate, right?
Ah, yeah. I missed it. Thanks for the catch.
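Something like this is what I plan to change it to (just a sketch; the offset computation assumes the extra load covers the tail of the original access, and should follow however the extra load is actually formed in the patch):

```
// Sketch only: query the hook with the alignment of the extra load we plan
// to generate, i.e. the original load's alignment adjusted by that load's
// byte offset, instead of the original load's alignment as-is.
uint64_t ExtraOffset =
    (SrcVT.getFixedSizeInBits() - ExtraWidth) / 8; // byte offset of extra load
Align ExtraLoadAlign = commonAlignment(LD->getOriginalAlign(), ExtraOffset);
if (TLI.allowMisalignedMemForNonPow2Type(SrcVT, LD->getAddressSpace(),
                                         ExtraLoadAlign, MMOFlags))
  IncSizeBits = ExtraWidth;
```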


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:2045
 
+bool AArch64TargetLowering::allowMisalignedMemForNonPow2Type(
+    EVT VT, unsigned AddrSpace, Align Alignment,
----------------
efriedma wrote:
> None of this logic looks like it's target-specific; can we just do this in target-independent code?
Yes, of course we can move this code into the target-independent layer.
But there are some regressions on other platforms like x86, coming from load/store pairs.
For now, I haven't enabled the optimization for stores, so when we have a load/store pair the load pattern no longer matches the store pattern.
For example,

```
define void @i56_or(ptr %a) {
  %aa = load i56, ptr %a, align 1
  %b = or i56 %aa, 384
  store i56 %b, ptr %a, align 1
  ret void
}
```
Actually, this is a general issue, but on AArch64 the gain from the load side covers the extra instructions, while on x86 it doesn't. So I limited this to AArch64.
Maybe we can also enable the optimization for stores to fix the issue, but I still have some concerns about that.
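To be clear about what I mean: if we moved it into target-independent code today, the natural default gate would be roughly the sketch below (the helper name is made up; it just forwards to the existing allowsMisalignedMemoryAccesses hook and requires the access to be fast). But x86 passes that check too, so it wouldn't avoid the load/store-pair regression above by itself; we'd still need the store side of the optimization or an extra profitability hook.

```
#include "llvm/CodeGen/TargetLowering.h"
using namespace llvm;

// Hypothetical target-independent profitability check (sketch only): only
// widen a non-power-of-2 load when the target reports the misaligned wide
// access as both legal and fast.
static bool isMisalignedNonPow2LoadProfitable(const TargetLowering &TLI,
                                              EVT WideVT, unsigned AddrSpace,
                                              Align Alignment,
                                              MachineMemOperand::Flags Flags) {
  bool Fast = false;
  if (!TLI.allowsMisalignedMemoryAccesses(WideVT, AddrSpace, Alignment, Flags,
                                          &Fast))
    return false;
  return Fast;
}
```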


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133408/new/

https://reviews.llvm.org/D133408


