[PATCH] D155299: [AArch64][SVE2] Combine add+lsr to rshrnb for stores
Matt Devereau via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 7 01:46:58 PDT 2023
MattDevereau added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20085
+ const AArch64Subtarget *Subtarget) {
+ EVT VT = Srl->getValueType(0);
+ if (!Subtarget->hasSVE2() || Srl->getOpcode() != ISD::SRL ||
----------------
dmgreen wrote:
> Can you either add a check that the VT is one that is valid for RSHRNB, or an assert that we only handle those types?
Since were matching the Rshrnb ISD node emitted with the patterns in tablegen that match `nvx8i16 -> nxv16i8`, `nxv4i32 -> nxv8i16`, `nxv2i64 -> nxv4i32` I've added an explicit block for this now:
```
EVT ResVT;
if (VT == MVT::nxv8i16)
ResVT = MVT::nxv16i8;
else if (VT == MVT::nxv4i32)
ResVT = MVT::nxv8i16;
else if (VT == MVT::nxv2i64)
ResVT = MVT::nxv4i32;
else
return SDValue();
```
We can use ResVT for the SDNode creation of Rshrnb and use VT for the bitcast back to the type that fits the DAG correctly.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20105
+
+ if (AddValue != 1 << (ShiftValue - 1))
+ return SDValue();
----------------
dmgreen wrote:
> This may need to be `1ULL << (ShiftValue - 1)`, as the number could be a 64bit value.
Done, and changed `int64_t AddValue` to `uint64_t AddValue` to line up with the change.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20109-20111
+ auto Rshrnb = DAG.getNode(
+ AArch64ISD::RSHRNB_I, DL, VT,
+ {Add->getOperand(0), DAG.getTargetConstant(ShiftValue, DL, MVT::i32)});
----------------
dmgreen wrote:
> ```
> SDValue Rshrnb = DAG.getNode(
> AArch64ISD::RSHRNB_I, DL, VT,
> Add->getOperand(0), DAG.getTargetConstant(ShiftValue, DL, MVT::i32));
> ```
> (Or just return it directly)
This is directly returning a bitcast SDNode back to the original VT now.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:20150
+ if (SDValue Rshrnb =
+ trySimplifySrlAddToRshrnb(cast<SDNode>(Op0), DAG, Subtarget))
+ return DAG.getNode(AArch64ISD::UZP1, DL, ResVT, Rshrnb, Op1);
----------------
dmgreen wrote:
> trySimplifySrlAddToRshrnb could take a SDValue, instead of needing the cast.
I get the feeling this is a side-grade since I need to pass in SDLoc as an extra parameter this way, but I can see the cast looks a bit ugly. I'm not fussy on this so I'll go with your suggestion.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D155299/new/
https://reviews.llvm.org/D155299
More information about the llvm-commits
mailing list