[PATCH] D65580: [ARM] Tighten up VLDRH.32 with low alignments

Wed Aug 7 02:28:21 PDT 2019

samparker added inline comments.

================
Comment at: llvm/test/CodeGen/Thumb2/mve-ldst-offset.ll:756
+; CHECK-NEXT:    .pad #8
+; CHECK-NEXT:    sub sp, #8
+; CHECK-NEXT:    ldr.w r3, [r0, #7]
----------------
simon_tatham wrote:
> samparker wrote:
> > I am so confused by this, can you explain it for me please?
> (Drive-by comment since this crossed my inbox)
> 
> I think what's going on here is:
> 
> `VLDRH.S32` means: load 8 bytes of memory, regard them as 4 16-bit halfwords (`H`), and sign-extend each one into a 32-bit lane (`S32`) of the output vector register.
> 
> But it requires alignment of at least 2 on the memory it's loading from. So in order to apply it to 8 bytes starting at an odd address, the generated code is copying the 8 source bytes to an aligned 8-byte stack slot, and then pointing the `VLDRH.S32` at that instead.
> 
> I assume this run of `llc` is in a mode where unaligned access support on the ordinary `LDR` instruction is taken to be enabled in the hardware configuration. (If I remember correctly, that's the default – to generate code compatible with a CPU that has that turned _off_ you have to say `-mno-unaligned-access` in clang, or whatever llc's equivalent option is.)
Bah, thanks! For some reason I wasn't thinking about the need to widen; all the loads really threw me.
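
For anyone else reading along, the sequence described above can be modelled roughly in C. This is only a sketch of the semantics, not what the backend actually emits: the function names are made up for illustration, little-endian halfword layout is assumed, and the `memcpy` stands in for the ordinary (unaligned-capable) loads followed by stores into the 8-byte stack slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of VLDRH.S32: read 8 bytes as 4 little-endian
   halfwords and sign-extend each one into a 32-bit lane. The real
   instruction needs its address to be at least 2-byte aligned. */
static void vldrh_s32_model(const uint8_t slot[8], int32_t lanes[4]) {
    for (int i = 0; i < 4; i++) {
        int32_t v = (int32_t)slot[2 * i] | ((int32_t)slot[2 * i + 1] << 8);
        if (v & 0x8000)
            v -= 0x10000;       /* sign-extend bit 15 into the upper bits */
        lanes[i] = v;
    }
}

/* Hypothetical model of the code generated for an under-aligned source:
   copy the 8 source bytes into an aligned stack slot, then point the
   widening load at the slot instead of the original address. */
static void load_underaligned_sext(const uint8_t *src, int32_t lanes[4]) {
    assert(src != NULL);
    _Alignas(8) uint8_t slot[8];    /* plays the role of "sub sp, #8" */
    memcpy(slot, src, sizeof slot); /* ordinary loads tolerate the odd address */
    vldrh_s32_model(slot, lanes);
}
```

Calling `load_underaligned_sext(buf + 7, lanes)` on a byte buffer mirrors the `[r0, #7]` case in the test: the odd offset is only ever seen by the byte copy, never by the widening load.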

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65580/new/

https://reviews.llvm.org/D65580