[PATCH] D127317: [AArch64][SME] Add ldr/str (fill/spill) intrinsics

Wed Mar 29 09:47:47 PDT 2023

bryanpkc added inline comments.

================
Comment at: llvm/test/CodeGen/AArch64/SME/sme-intrinsics-stores.ll:289
+; CHECK-NEXT:    addvl x8, x0, #16
+; CHECK-NEXT:    str za[w12, 0], [x8]
+; CHECK-NEXT:    ret
----------------
sdesmalen wrote:
> bryanpkc wrote:
> > @david-arm How does the VL multiplier in the address operand affect the vector offset? Could you explain why `za[w12, 0]` through `za[w12, 15]` are selected for VL multipliers 0-15, but for all multipliers of 16 and above, only `za[w12, 0]` is selected (`w12` remains zero in all cases)? I understand that ACLE intrinsics won't actually generate slice offsets larger than 15, but the behaviour of the LLVM intrinsics is confusing, as they do not accept an explicit slice offset argument.
> I'm not sure if this answers your question, but it seems we had a fix for the stores that we hadn't pushed upstream yet. It avoids creating e.g. `str za[w12, 0], [x0, #15, mul vl]`, since the immediates must match up. The fix can be found here: D147136.
Thanks! I also noticed that discrepancy between `LDR` and `STR` later yesterday. But it is still not clear to me how the LLVM intrinsics are designed to work. Determining the slice offset based on the addend in the address operand, and resetting the slice offset to 0 if the addend is larger than 256 bytes, seems rather arbitrary. Am I missing something?
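
To make the question concrete, here is a minimal Python sketch of the selection behaviour as I understand it from the test output quoted above. It is only a model, not the actual instruction selection code: `select_za_operands` is a hypothetical helper, and the 0-15 range is taken from the cases discussed in this thread.

```python
def select_za_operands(vl_multiplier):
    """Model of the observed LDR/STR ZA operand selection (hypothetical helper).

    Returns (slice_imm, base_adjust):
      slice_imm   - the immediate N in `za[w12, N]`, which must also be the
                    N in `[xN, #N, mul vl]` since the immediates must match up
      base_adjust - VL multiplier added to the base register beforehand
                    (e.g. the `addvl x8, x0, #16` in the test above)
    """
    if 0 <= vl_multiplier <= 15:
        # Assumption: the multiplier is folded into both immediates.
        return vl_multiplier, 0
    # Otherwise the slice immediate is reset to 0 and the whole
    # multiplier is applied to the base address separately.
    return 0, vl_multiplier


for m in (0, 15, 16):
    print(m, select_za_operands(m))
```

In this model the address addend alone decides which ZA slice is accessed, which is the part that looks arbitrary to me.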

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127317/new/

https://reviews.llvm.org/D127317