[PATCH] D155299: [AArch64][SVE2] Combine add+lsr to rshrnb for stores

Matt Devereau via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 7 01:46:58 PDT 2023

MattDevereau added inline comments.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20085
+                                         const AArch64Subtarget *Subtarget) {
+  EVT VT = Srl->getValueType(0);
+  if (!Subtarget->hasSVE2() || Srl->getOpcode() != ISD::SRL ||
dmgreen wrote:
> Can you either add a check that the VT is one that is valid for RSHRNB, or an assert that we only handle those types?
Since were matching the Rshrnb ISD node emitted with the patterns in tablegen that match `nvx8i16 -> nxv16i8`, `nxv4i32 -> nxv8i16`, `nxv2i64 -> nxv4i32` I've added an explicit block for this now:

  EVT ResVT;
  if (VT == MVT::nxv8i16)
    ResVT = MVT::nxv16i8;
  else if (VT == MVT::nxv4i32)
    ResVT = MVT::nxv8i16;
  else if (VT == MVT::nxv2i64)
    ResVT = MVT::nxv4i32;
    return SDValue();
We can use ResVT for the SDNode creation of Rshrnb and use VT for the bitcast back to the type that fits the DAG correctly.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20105
+  if (AddValue != 1 << (ShiftValue - 1))
+    return SDValue();
dmgreen wrote:
> This may need to be `1ULL << (ShiftValue - 1)`, as the number could be a 64bit value.
Done, and changed `int64_t AddValue` to `uint64_t AddValue` to line up with the change.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20109-20111
+  auto Rshrnb = DAG.getNode(
+      AArch64ISD::RSHRNB_I, DL, VT,
+      {Add->getOperand(0), DAG.getTargetConstant(ShiftValue, DL, MVT::i32)});
dmgreen wrote:
> ```
> SDValue Rshrnb = DAG.getNode(
>       AArch64ISD::RSHRNB_I, DL, VT,
>       Add->getOperand(0), DAG.getTargetConstant(ShiftValue, DL, MVT::i32));
> ```
> (Or just return it directly)
This is directly returning a bitcast SDNode back to the original VT now.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20150
+  if (SDValue Rshrnb =
+          trySimplifySrlAddToRshrnb(cast<SDNode>(Op0), DAG, Subtarget))
+    return DAG.getNode(AArch64ISD::UZP1, DL, ResVT, Rshrnb, Op1);
dmgreen wrote:
> trySimplifySrlAddToRshrnb could take a SDValue, instead of needing the cast.
I get the feeling this is a side-grade since I need to pass in SDLoc as an extra parameter this way, but I can see the cast looks a bit ugly. I'm not fussy on this so I'll go with your suggestion.

  rG LLVM Github Monorepo



More information about the llvm-commits mailing list