[PATCH] D75580: [llvm][CodeGen][SVE] Implement IR intrinsics for gather prefetch.

Andrzej Warzynski via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 10 05:12:15 PDT 2020


andwar added a comment.

@fpetrogalli Thanks for the update - this has evolved into a very neat patch! I've left a few nits, nothing major. One additional nice-to-have: could the commit message contain the list of intrinsics that you are adding? Personally, I think that references to the ACLE there are not needed, but I don't mind them being there.



================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12914
 
+/// Combine the gather prefetch (scalar + vector addressing mode) when
+/// the offset vector is an unpacked 32-bit scalable vector.
----------------
sdesmalen wrote:
> nit: this comment is a bit confusing. It doesn't so much 'combine' anything, but it rather 'legalizes the offset'.
This method is only needed for unpacked offsets, right? Maybe worth stating that for all other cases the input node is already OK? Otherwise I read this as:
* do something for unpacked 32-bit offsets; in all other cases "I don't know".

And in reality it's more like:
* do something for unpacked 32-bit offsets; in all other cases it's already OK.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12920
+
+  // Not an unpacked vector, bail off.
+  if (Offset.getValueType().getSimpleVT().SimpleTy != MVT::nxv2i32)
----------------
`bail off` -> `bail out`


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12924
+
+  // Extend the unpacked offset vector to 64-bit lanes.
+  const SDLoc DL(N);
----------------
[nit] IMO the comment and the code are inconsistent. What about this:
```
  const SDLoc DL(N);

  // Extend the unpacked offset vector to 64-bit lanes.
  Offset = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::nxv2i64, Offset);

  SDValue Ops[] = {
      N->getOperand(0),   // Chain
      N->getOperand(1),   // Intrinsic ID
      N->getOperand(2),   // Base
      Offset,             // Offset
      N->getOperand(4),   // Pg
      N->getOperand(5)    // prfop
  };
```
This way the other comments are not unnecessarily pushed away.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12937
+
+// Check if the value of \p Offset represent a valid immediate for the
+// SVE gather prefetch instruction with vector base and immediate
----------------
[nit] represent -> represents
[nit] Inconsistent comment style - you use `//` here, but you used `///` before. Since these are static methods, I've been using `//` everywhere, but the file is inconsistent in this respect. So choose the one you like best :) 


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12945
+static bool
+isValidOffsetImmediateForSVEGatherPrefetch(SDValue Offset,
+                                           unsigned ScalarSizeInBytes) {
----------------
Very nice and useful method :) Could you please generalise the name and use it in `performGatherLoadCombine` and `performScatterStoreCombine`:
*  https://github.com/llvm/llvm-project/blob/ff9ac33e1e02a7c637b8cdf081a7d88f40bf387f/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L12768
* https://github.com/llvm/llvm-project/blob/ff9ac33e1e02a7c637b8cdf081a7d88f40bf387f/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L12674

It's an identical condition, just written horribly by me :)
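For context, the shared condition being discussed can be sketched as a standalone check: for SVE gather/scatter/prefetch forms with a vector base plus immediate offset, the immediate must be a multiple of the element size and, once scaled, fit the 5-bit immediate field. The function name and exact bounds below are my assumption for illustration, not the patch's code:

```cpp
#include <cassert>

// Hypothetical sketch (name and bounds assumed, not the patch's code):
// a byte offset is a valid immediate for the SVE vector-base-plus-immediate
// addressing mode if it is a multiple of the element size and the scaled
// value fits in the 5-bit immediate field [0, 31].
static bool isValidImmForSVEVecImmAddrMode(unsigned OffsetInBytes,
                                           unsigned ScalarSizeInBytes) {
  // The immediate must be a multiple of the element size in bytes...
  if (OffsetInBytes % ScalarSizeInBytes != 0)
    return false;
  // ...and the scaled immediate must fit in 5 bits.
  return OffsetInBytes / ScalarSizeInBytes <= 31;
}
```

For example, with 16-bit elements (`prfh`) the valid byte offsets would be 0, 2, ..., 62, while 63 (not a multiple of 2) and 64 (scaled value 32, out of range) would be rejected and need legalization.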


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12965
+
+/// Combines a node carrying the intrinsic `aarch64_sve_gather_prf<T>`
+/// into a node that uses
----------------
Hm, it's a bit more like: Check whether we need to remap? Yes - remap, No - all good! The way it's written now suggests that this method will always remap.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12973
+static SDValue performSVEPrefetchVecBaseImmOffCombine(
+    SDNode *N, SelectionDAG &DAG, unsigned NewIID, unsigned ScalarSizeInBytes) {
+  // No need to combine the node if the immediate is valid...
----------------
Broken indentation.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12980
+  // ...otherwise, remap `aarch64_sve_gather_prf<T>` to
+  // `aarch64_sve_gather_prf<T>_scaled_uxtw` by swapping base with the
+  // offset (operand 2 and 3 of N, respectively).
----------------
... and by updating the Intrinsic ID ;-)


================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:6484
 
+  def : Pat<(op (nxv4i32 ZPR32:$Zn), (i64 imm_ty:$imm), (nxv4i1 PPR_3b:$Pg), (i32 sve_prfop:$prfop)),
+            (!cast<Instruction>(NAME) sve_prfop:$prfop, PPR_3b:$Pg, ZPR32:$Zn, imm_ty:$imm)>;
----------------
[nit] Unwritten rule - `Pat` always follows `InstAlias`.


================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:6851
 
+  def : Pat<(op (nxv2i64 ZPR32:$Zn), (i64 imm_ty:$imm), (nxv2i1 PPR_3b:$Pg), (i32 sve_prfop:$prfop)),
+            (!cast<Instruction>(NAME) sve_prfop:$prfop, PPR_3b:$Pg, ZPR32:$Zn, imm_ty:$imm)>;
----------------
[nit] `Pat` follows `InstAlias`


================
Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-gather-prefetches-scaled-offset.ll:68
+; CHECK-LABEL: llvm_aarch64_sve_gather_prfh_scaled_uxtw_nx2vi64:
+; CHECK-NEXT:       prfh  pldl1strm, p0, [x0, z0.d, uxtw #1]
+; CHECK-NEXT:  ret
----------------
[nit] Inconsistent formatting


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75580/new/

https://reviews.llvm.org/D75580
