[llvm] [RISCV] Match deinterleave(4,8) shuffles to SHL/TRUNC when legal (PR #118509)
Craig Topper via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 3 15:25:26 PST 2024
================
@@ -5332,11 +5301,31 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
if (ShuffleVectorInst::isReverseMask(Mask, NumElts) && V2.isUndef())
return DAG.getNode(ISD::VECTOR_REVERSE, DL, VT, V1);
- // If this is a deinterleave and we can widen the vector, then we can use
- // vnsrl to deinterleave.
- if (SDValue Src =
- isDeinterleaveShuffle(VT, ContainerVT, V1, V2, Mask, Subtarget))
- return getDeinterleaveViaVNSRL(DL, VT, Src, Mask[0] == 0, DAG);
+ // If this is a deinterleave(2,4,8) and we can widen the vector, then we can
+ // use shift and truncate to perform the shuffle.
+ // TODO: For Factor=6, we can perform the first step of the deinterleave via
+ // shift-and-trunc reducing total cost for everything except an mf8 result.
+ // TODO: For Factor=4,8, we can do the same when the ratio isn't high enough
+ // to do the entire operation.
+ if (VT.getScalarSizeInBits() < Subtarget.getELen()) {
+ const unsigned MaxFactor = Subtarget.getELen() / VT.getScalarSizeInBits();
+ assert(MaxFactor == 2 || MaxFactor == 4 || MaxFactor == 8);
+ for (unsigned Factor = 2; Factor <= MaxFactor; Factor <<= 1) {
+ unsigned Index = 0;
+ if (ShuffleVectorInst::isDeInterleaveMaskOfFactor(Mask, Factor, Index) &&
+ 1 < count_if(Mask, [](int Idx) { return Idx != -1; })) {
+ if (SDValue Src = getSingleShuffleSrc(VT, ContainerVT, V1, V2)) {
+ if (Src.getValueType() == VT) {
+ EVT WideVT = VT.getDoubleNumVectorElementsVT();
----------------
topperc wrote:
> Are we bothered by the toggles? I can probably eliminate them < m1 by adding back the removed code, but conditional on LMUL. For the > m2 cases, this is probably an improvement. We could potentially treat this as a generic improvement to add to e.g. VLOptimizer too.
I agree it's probably an improvement for the wide LMUL. I don't think I'm bothered by the toggles. Some of the tests are producing undefined elements in the upper half of the shuffle result and then using them in a later instruction. Is that likely in real code?
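
For readers following the hunk above, here is a minimal scalar sketch of the shift-and-truncate idea described in the new comment. It is an illustration only and not code from the patch: `deinterleaveViaShiftTrunc` is a hypothetical helper, the element width is fixed at 8 bits, and it assumes the lower-indexed element lands in the low bits of the widened element. Each group of `Factor` adjacent elements is viewed as one wide element, shifted right by `Index` element widths, and truncated back to the narrow width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of LLVM: a scalar model of a
// deinterleave(Factor) shuffle of i8 elements done as shift + truncate.
// Assumes Factor is 2, 4, or 8 (so a group fits in 64 bits) and
// Index < Factor.
std::vector<uint8_t> deinterleaveViaShiftTrunc(const std::vector<uint8_t> &Src,
                                               unsigned Factor,
                                               unsigned Index) {
  std::vector<uint8_t> Result;
  for (std::size_t I = 0; I + Factor <= Src.size(); I += Factor) {
    // Reinterpret Factor adjacent i8 elements as one wide element,
    // with element I + J occupying bits [8 * J, 8 * J + 7].
    uint64_t Wide = 0;
    for (unsigned J = 0; J < Factor; ++J)
      Wide |= static_cast<uint64_t>(Src[I + J]) << (8 * J);
    // Shift the wanted element down to bit 0, then truncate to i8.
    Result.push_back(static_cast<uint8_t>(Wide >> (8 * Index)));
  }
  return Result;
}
```

With this model, a source of `{0, 1, 2, 3, 4, 5, 6, 7}` with `Factor = 4` and `Index = 1` yields `{1, 5}`. The widest group (8 elements of 8 bits) needs 64 bits, which mirrors the `ELEN / SEW` bound on `MaxFactor` in the hunk.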
https://github.com/llvm/llvm-project/pull/118509