[PATCH] D65580: [ARM] Tighten up VLDRH.32 with low alignments

Wed Aug 7 02:08:34 PDT 2019

simon_tatham added inline comments.

================
Comment at: llvm/test/CodeGen/Thumb2/mve-ldst-offset.ll:756
+; CHECK-NEXT:    .pad #8
+; CHECK-NEXT:    sub sp, #8
+; CHECK-NEXT:    ldr.w r3, [r0, #7]
----------------
samparker wrote:
> I am so confused by this, can you explain it for me please?
(Drive-by comment since this crossed my inbox)

I think what's going on here is:

`VLDRH.S32` means: load 8 bytes of memory, regard them as 4 16-bit halfwords (`H`), and sign-extend each one into a 32-bit lane (`S32`) of the output vector register.

But it requires the memory it's loading from to be at least 2-byte aligned. So in order to apply it to 8 bytes starting at an odd address, the generated code is copying the 8 source bytes to an aligned 8-byte stack slot, and then pointing the `VLDRH.S32` at that instead.
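
To make that concrete, here is a rough C model of what I believe the sequence computes. It is only a sketch: the function name `model_vldrh_s32` is made up for illustration, it assumes little-endian halfwords, and it is not taken from LLVM or from the Arm pseudocode.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of "copy to an aligned stack slot, then VLDRH.S32".
 * Illustrative only; assumes little-endian halfword layout. */
void model_vldrh_s32(const unsigned char *src, int32_t lanes[4])
{
    /* An 8-byte slot with natural alignment, standing in for the
     * stack area reserved by ".pad #8" / "sub sp, #8". */
    uint64_t slot;

    /* memcpy has no alignment requirement, so src may be an odd address. */
    memcpy(&slot, src, sizeof slot);
    const unsigned char *bytes = (const unsigned char *)&slot;

    for (int i = 0; i < 4; i++) {
        /* Assemble halfword i from its two bytes (the "H" part). */
        uint32_t half = (uint32_t)bytes[2 * i] |
                        ((uint32_t)bytes[2 * i + 1] << 8);

        /* Sign-extend the 16-bit value into a 32-bit lane (the "S32" part). */
        lanes[i] = (half & 0x8000u) ? (int32_t)half - 0x10000
                                    : (int32_t)half;
    }
}
```

Calling it with a pointer one byte past the start of a buffer corresponds to the odd-address case in the test.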

I assume this run of `llc` is in a mode where it assumes unaligned access support on the ordinary `LDR` instruction has been enabled in the hardware configuration, which would be why it's happy to do the copy with a word load from an odd address like `ldr.w r3, [r0, #7]`. (If I remember rightly, that's the default – to generate code compatible with a CPU that has that turned _off_ you have to say `-mno-unaligned-access` in clang, or whatever llc's equivalent option is.)

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65580/new/

https://reviews.llvm.org/D65580