[PATCH] D74858: [AArch64][SVE] Add intrinsics for non-temporal gather-loads/scatter-stores

Wed Feb 26 05:49:52 PST 2020

andwar marked an inline comment as done.
andwar added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12622
+  // order, we may need to swap them to match the spec.
+  if (Opcode == AArch64ISD::SSTNT1 && Offset.getValueType().isVector())
+      std::swap(Base, Offset);
----------------
sdesmalen wrote:
> Have you experimented moving this code out of this function (same for the `SST1_IMM` case below), and pass in the Base and Offset (and possibly Chain/PG) as operands to `performScatterStoreCombine`. In the case-statement for `aarch64_sve_st1_scatter_scalar_offset` and `aarch64_sve_stnt1_scatter_scalar_offset `  you can than do the swap. That seems a bit better than special handling these cases in this combine itself.
Yes, but currently:
* it is very explicit that `AArch64ISD::SSTNT1` requires some special treatment, which IMO is a bit counterintuitive (hopefully the comments make it clear!)
* I'm only passing one argument (`*N`), instead of 3 (`*N`, `SDValue`, `SDValue`) to `performScatterStoreCombine` (not counting the other arguments), so the call-site is cleaner if we keep things as they are.

Also, `SST1_IMM` requires 2 conditions to be checked and the opcode to be updated (and there are 2 possibilities here, either `SST1_UXTW` or `SST1`). Swapping `Base` and `Offset` when calling `performScatterStoreCombine` wouldn't be enough to replicate this. 

Having said that, I've been looking at `performScatterStoreCombine`/ `performGatherLoadCombine` for a while now and I wouldn't be surprised if I'm over-engineering this :)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74858/new/

https://reviews.llvm.org/D74858