[PATCH] D152205: [Aarch64][SVE]SVE2] Enable tbl, tbl2 for shuffle lowering for fixed vector types.
Caroline via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 24 08:57:42 PDT 2023
CarolineConcatto added a comment.
Hi Dinar,
I re-wrote your code in the Lowering to something I could understand better.
I will still spend some time trying to understand the mask logic and see if the new scalable vector is correct.
But I believe it is good to share my comments for now
+ // Avoid producing TBL and TBL2 instructions if we don't know
+ // SVE register size or minimal is unequal to maximum size.
+ if (!MaxSVESize || MinSVESize != MaxSVESize)
+ return SDValue();
- return SDValue();
+ // Element type is ???
+ unsigned BitsPerElt = VT.getVectorElementType().getSizeInBits();
+ if (BitsPerElt != 8 && BitsPerElt != 16 && BitsPerElt != 32 &&
+ BitsPerElt != 64)
+ return SDValue();
+
+ bool Swap = false;
+ if (Op1.isUndef() || isZerosVector(Op1.getNode())) {
+ // Swapped shuffle operands, but we have to recalculate indexes afterwards.
+ std::swap(Op1, Op2);
+ Swap = true;
+ }
+
+
+ bool IsUndefOrZero = Op2.isUndef() || isZerosVector(Op2.getNode());
+ unsigned NumElts = VT.getVectorNumElements();
+ unsigned IndexLen = MinSVESize / BitsPerElt;
+ unsigned FillElements = IndexLen - NumElts;
+
+ // Set the Mask with the correct indexes
+ SmallVector<SDValue, 8> TBLMask;
+ for (int Val : ShuffleMask) {
+ unsigned Offset = Val;
+ if (Swap)
+ Offset = Offset < NumElts ? Offset + NumElts : Offset - NumElts;
+ // why we need to set the mask to 255 when one of the vectors is zero/undef?
+ else if(Offset >= NumElts){
+ if(IsUndefOrZero)
+ Offset = 255;
+ else if (IndexLen > NumElts)
+ Offset += IndexLen - NumElts;
+ }
+ TBLMask.push_back(DAG.getConstant(Offset, DL, MVT::i64));
+ }
+ // Filling the rest of the mask with 255
+ for (unsigned i = 0; i < FillElements; ++i)
+ TBLMask.push_back(DAG.getConstant(255, DL, MVT::i64));
+
+ // Building a scalable vector for the suffle. So we can use the sve instructions
+ EVT MaskEltType = EVT::getIntegerVT(*DAG.getContext(), BitsPerElt);
+ EVT MaskType = EVT::getVectorVT(*DAG.getContext(), MaskEltType, IndexLen);
+ EVT MaskContainerVT = getContainerForFixedLengthVector(DAG, MaskType);
+ SDValue VecMask =
+ DAG.getBuildVector(MaskType, DL, ArrayRef(TBLMask.data(), IndexLen));
+ SDValue SVEMask = convertToScalableVector(DAG, MaskContainerVT, VecMask);
+
+ SDValue Shuffle;
+ if (IsUndefOrZero || Swap) {
+ Shuffle = convertFromScalableVector(
+ DAG, VT,
+ DAG.getNode(ISD::INTRINSIC_WO_CHAIN, DL, ContainerVT,
+ DAG.getConstant(Intrinsic::aarch64_sve_tbl, DL, MVT::i32),
+ Op1, SVEMask));
+ } else {
+ if (Subtarget->hasSVE2())
+ Shuffle = convertFromScalableVector(
+ DAG, VT,
+ DAG.getNode(
+ ISD::INTRINSIC_WO_CHAIN, DL, ContainerVT,
+ DAG.getConstant(Intrinsic::aarch64_sve_tbl2, DL, MVT::i32), Op1,
+ Op2, SVEMask));
+ else
+ return SDValue();
+ }
+ return DAG.getNode(ISD::BITCAST, DL, Op.getValueType(), Shuffle);
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25748-25749
+
+ if (BitsPerElt != 8 && BitsPerElt != 16 && BitsPerElt != 32 &&
+ BitsPerElt != 64)
+ return SDValue();
----------------
Why do we need to test this? What other value it can have?
I thought the only possible values were this 8/16/32/64. Is it possible to have 128, 1024? Are these values we are avoiding?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25761
+ Offset = 255;
+ else if (Offset >= NumElts && FillElements)
+ Offset += IndexLen - NumElts;
----------------
For me this tests is a bit odd.
FillElements is unsigned and not a bool. So are you checking if IndexLen> NumElts?
Is it too difficult to put and example for each case with the fixed mask. So we can see how the mask was and how it has changed.
if (Swap)
for Offset < NumElts ->mask = <0,0,7,7> -> <3,3,0,0>
for Offset > NumElts ->mask = <0,0,7,7>
Maybe this will help to understand if there is any hidden problem with these ifs.
I re-wrote that to something like:
```
// Set the Mask with the correct indexes
SmallVector<SDValue, 8> TBLMask;
for (int Val : ShuffleMask) {
unsigned Offset = Val;
if (Swap)
Offset = Offset < NumElts ? Offset + NumElts : Offset - NumElts;
// why we need to set the mask to 255 when one of the vectors is zero/undef?
else if(Offset >= NumElts){
if(IsUndefOrZero)
Offset = 255;
else if (IndexLen > NumElts)
Offset += IndexLen - NumElts;
}
TBLMask.push_back(DAG.getConstant(Offset, DL, MVT::i64));
}
```
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25765
+ }
+ for (unsigned i = 0; i < FillElements; ++i)
+ TBLMask.push_back(DAG.getConstant(255, DL, MVT::i64));
----------------
I would suggest add a comment here.
You are populating the rest of the mask here with 255, why?
// Filling up the remaining positions of the mask with 255 because ...
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25768
+
+ EVT MaskContainerVT = getContainerForFixedLengthVector(DAG, MaskType);
+ SDValue VecMask =
----------------
Are you here changing a fixed vector to a scalable vector to use tbl?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D152205/new/
https://reviews.llvm.org/D152205
More information about the llvm-commits
mailing list