[PATCH] D117900: [AArch64][SVE] Fold gather/scatter with 32bits when possible
Caroline via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 27 03:24:55 PST 2022
CarolineConcatto added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16127
+ if (Index.getOpcode() == ISD::ADD &&
+ Index.getOperand(1).getOpcode() == ISD::STEP_VECTOR) {
+ auto *C = dyn_cast<ConstantSDNode>(Index.getOperand(1).getOperand(0));
----------------
sdesmalen wrote:
> This makes the assumption that STEP_VECTOR is on the right-hand side. There is currently no code that forces this, so you'll want to add a condition to SelectionDAG::getNode() where it reorders the constants to the right, to do something like this:
> ```
> - if ((IsN1C && !IsN2C) || (IsN1CFP && !IsN2CFP))
> +
> + // Favour step_vector on the RHS because it's a kind of
> + // constant.
> + if (((N1.getOpcode() == ISD::STEP_VECTOR || IsN1C) && !IsN2C) ||
> + (IsN1CFP && !IsN2CFP))
> ```
I don't think it is a good solution to add this in SelectionDAG.
I could be wrong, but when I do this suggestion I see regressions in sve-stepvector.ll and sve-intrinsics-index. Instead on only 1 instruction it becomes 5.
For instance,
```
define <vscale x 16 x i8> @index_rr_i8(i8 %a, i8 %b) {
; CHECK-LABEL: index_rr_i8:
; CHECK: // %bb.0:
-; CHECK-NEXT: index z0.b, w0, w1
+; CHECK-NEXT: index z1.b, #0, #1
+; CHECK-NEXT: mov z2.b, w1
+; CHECK-NEXT: mov z0.b, w0
+; CHECK-NEXT: ptrue p0.b
+; CHECK-NEXT: mla z0.b, p0/m, z2.b, z1.b
```
I could add here, but as you saw the C++ code in here would not be nicer.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16134
+ if (ScaledOffset)
+ Offset = DAG.getNode(ISD::MUL, DL, MVT::i64, Offset, N->getScale());
+ BasePtr = DAG.getNode(ISD::ADD, DL, MVT::i64, BasePtr, Offset);
----------------
sdesmalen wrote:
> nit: Does `N->getScale()` always return a value even when ScaledOffset == false?
> If so, and it returns a Constant `1` by default, then you can remove the `if (ScaledOffset)` condition above.
As we also discussed the test
```
if (ScaledOffset)
```
could be removed and Offset could always be multiplied by N->getScale() as it will never be zero.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16136
+ BasePtr = DAG.getNode(ISD::ADD, DL, MVT::i64, BasePtr, Offset);
+ ChangedIndex = true;
+ }
----------------
sdesmalen wrote:
> If `StepConst` is only set in this block (or in general, when we know for sure that we can fold this into a simpler addressing mode), then ChangedIndex is redundant, and you can just check that StepConst != 0.
So my worried was that the constant could be zero.
But I checked and this would remove step vector
The code is optimized and step is removed, for the correct reason.
Index = step(const) * splat(0) -> Index = splat(0)
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16145
+ if (auto Shift = DAG.getSplatValue(Index.getOperand(1)))
+ if (auto Offset = DAG.getSplatValue(Index.getOperand(0).getOperand(0))) {
+ auto *C = dyn_cast<ConstantSDNode>(Shift);
----------------
sdesmalen wrote:
> This isn't very easy to read, I'd recommend creating three variables instead:
>
> if (Index.getOpcode() == ISD::SHL &&
> Index.getOperand(0).getOpcode() == ISD::ADD &&
> Index.getOperand(0).getOperand(1).getOpcode() == ISD::STEP_VECTOR) {
> SDValue ShiftAmount = Index.getOperand(1);
> SDValue BaseOffset = Index.getOperand(0).getOperand(0);
> SDValue Step = Index.getOperand(0).getOperand(1);
> if (auto Shift = DAG.getSplatValue(ShiftAmount))
> if (auto *ShiftC = dyn_cast<ConstantSDNode>(Shift))
> if (auto Offset = DAG.getSplatValue(BaseOffset)) {
> // ... code to calculate Stride, Offset and BasePtr here ...
I pushed this fold to be in a second patch, maybe it will be better to review.
In case not I will add these extra variables.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117900/new/
https://reviews.llvm.org/D117900
More information about the llvm-commits
mailing list