[PATCH] D117900: [AArch64][SVE] Fold gather/scatter with 32bits when possible
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 25 14:08:47 PST 2022
sdesmalen added a comment.
The approach seems right to me, but the code is still a little messy so I mostly left a bunch of suggestions/nits to improve readability.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16104
+ SelectionDAG &DAG) {
+ bool ScaledOffset =
+ (IndexType == ISD::SIGNED_SCALED) || (IndexType == ISD::UNSIGNED_SCALED);
----------------
nit: The word 'Offset' is used quite a few times in this function, which confused me a bit. I think what this is saying is that the eventual instruction will scale the offset (i.e. the offset is not an offset, but an index). So maybe change this to `OffsetIsScaledByInstruction`?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16107-16118
+ // Only consider element types that are pointer sized as smaller types can
+ // be easily promoted.
+ if (Index.getValueType().getVectorElementType() != MVT::i64)
+ return false;
+
+ EVT IndexVT = Index.getValueType();
+ assert(IndexVT.getVectorElementType() == MVT::i64 &&
----------------
nit: this is equivalent to:
// Only consider element types that are pointer sized as smaller types can
// be easily promoted. Also ignore indices that are already type legal.
EVT IndexVT = Index.getValueType();
if (IndexVT.getVectorElementType() != MVT::i64 || IndexVT == MVT::nxv2i64)
return false;
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16120
+
+ IndexVT = IndexVT.changeVectorElementType(MVT::i32);
+ int64_t StepConst = 0;
----------------
nit: IndexVT isn't used until line 16173, so don't change it here but rather closest to where it's used. Also, I'd suggest storing it a new variable, named e.g. `NewIndexVT`.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16121
+ IndexVT = IndexVT.changeVectorElementType(MVT::i32);
+ int64_t StepConst = 0;
+ SDLoc DL(N);
----------------
Maybe rename this to `Stride` ?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16127
+ if (Index.getOpcode() == ISD::ADD &&
+ Index.getOperand(1).getOpcode() == ISD::STEP_VECTOR) {
+ auto *C = dyn_cast<ConstantSDNode>(Index.getOperand(1).getOperand(0));
----------------
This makes the assumption that STEP_VECTOR is on the right-hand side. There is currently no code that forces this, so you'll want to add a condition to SelectionDAG::getNode() where it reorders the constants to the right, to do something like this:
```
- if ((IsN1C && !IsN2C) || (IsN1CFP && !IsN2CFP))
+
+ // Favour step_vector on the RHS because it's a kind of
+ // constant.
+ if (((N1.getOpcode() == ISD::STEP_VECTOR || IsN1C) && !IsN2C) ||
+ (IsN1CFP && !IsN2CFP))
```
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16128
+ Index.getOperand(1).getOpcode() == ISD::STEP_VECTOR) {
+ auto *C = dyn_cast<ConstantSDNode>(Index.getOperand(1).getOperand(0));
+ if (!C)
----------------
If `Index.getOperand(1)` is ISD::STEP_VECTOR, then it's operand *must* be a constant. There's no need to test for it, you can call `Index.getOperand(0).getConstantOperandVal(0)` directly.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16134
+ if (ScaledOffset)
+ Offset = DAG.getNode(ISD::MUL, DL, MVT::i64, Offset, N->getScale());
+ BasePtr = DAG.getNode(ISD::ADD, DL, MVT::i64, BasePtr, Offset);
----------------
nit: Does `N->getScale()` always return a value even when ScaledOffset == false?
If so, and it returns a Constant `1` by default, then you can remove the `if (ScaledOffset)` condition above.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16136
+ BasePtr = DAG.getNode(ISD::ADD, DL, MVT::i64, BasePtr, Offset);
+ ChangedIndex = true;
+ }
----------------
If `StepConst` is only set in this block (or in general, when we know for sure that we can fold this into a simpler addressing mode), then ChangedIndex is redundant, and you can just check that StepConst != 0.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16145
+ if (auto Shift = DAG.getSplatValue(Index.getOperand(1)))
+ if (auto Offset = DAG.getSplatValue(Index.getOperand(0).getOperand(0))) {
+ auto *C = dyn_cast<ConstantSDNode>(Shift);
----------------
This isn't very easy to read, I'd recommend creating three variables instead:
if (Index.getOpcode() == ISD::SHL &&
Index.getOperand(0).getOpcode() == ISD::ADD &&
Index.getOperand(0).getOperand(1).getOpcode() == ISD::STEP_VECTOR) {
SDValue ShiftAmount = Index.getOperand(1);
SDValue BaseOffset = Index.getOperand(0).getOperand(0);
SDValue Step = Index.getOperand(0).getOperand(1);
if (auto Shift = DAG.getSplatValue(ShiftAmount))
if (auto *ShiftC = dyn_cast<ConstantSDNode>(Shift))
if (auto Offset = DAG.getSplatValue(BaseOffset)) {
// ... code to calculate Stride, Offset and BasePtr here ...
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16167-16172
+ MachineFunction &MF = DAG.getMachineFunction();
+ llvm::Optional<unsigned int> MaxVScale = 1;
+ if (MF.getFunction().hasFnAttribute(Attribute::VScaleRange))
+ MaxVScale = MF.getFunction()
+ .getFnAttribute(Attribute::VScaleRange)
+ .getVScaleRangeMax();
----------------
You can more easily calculate MaxVScale from:
const auto &Subtarget =
static_cast<const AArch64Subtarget &>(DAG.getSubtarget());
unsigned MaxVScale = Subtarget.getMaxSVEVectorSizeInBits() / AArch64::SVEBitsPerBlock;
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16190
+
+ SDLoc DL(MGS);
+ SDValue Chain = MGS->getChain();
----------------
Maybe just move all these definitions after the `if(!DCI.isBeforeLegalize()) return SDValue();` (that is, if you take my suggestion mentioned below).
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16198
+
+ if (DCI.isBeforeLegalize()) {
+ // Here we catch such cases early and change MGATHER's IndexType to allow
----------------
nit: just bail out early to avoid a level of indentation, e.g.
if (!DCI.isBeforeLegalize())
return SDValue();
if (FindMoreOptimalIndexType(...)) {
...
}
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17678-17681
+ case ISD::MGATHER:
+ return performMaskedGatherScatterCombine(N, DCI, DAG);
+ case ISD::MSCATTER:
+ return performMaskedGatherScatterCombine(N, DCI, DAG);
----------------
nit:
case ISD::MGATHER:
case ISD::MSCATTER:
return performMaskedGatherScatterCombine(N, DCI, DAG);
?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117900/new/
https://reviews.llvm.org/D117900
More information about the llvm-commits
mailing list