[llvm] [RISCV][CG]Use processShuffleMasks for per-register shuffles (PR #120803)
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 2 09:39:54 PST 2025
================
@@ -5120,58 +5119,70 @@ static SDValue lowerShuffleViaVRegSplitting(ShuffleVectorSDNode *SVN,
MVT ElemVT = VT.getVectorElementType();
unsigned ElemsPerVReg = *VLen / ElemVT.getFixedSizeInBits();
- unsigned VRegsPerSrc = NumElts / ElemsPerVReg;
-
- SmallVector<std::pair<int, SmallVector<int>>>
- OutMasks(VRegsPerSrc, {-1, {}});
-
- // Check if our mask can be done as a 1-to-1 mapping from source
- // to destination registers in the group without needing to
- // write each destination more than once.
- for (unsigned DstIdx = 0; DstIdx < Mask.size(); DstIdx++) {
- int DstVecIdx = DstIdx / ElemsPerVReg;
- int DstSubIdx = DstIdx % ElemsPerVReg;
- int SrcIdx = Mask[DstIdx];
- if (SrcIdx < 0 || (unsigned)SrcIdx >= 2 * NumElts)
- continue;
- int SrcVecIdx = SrcIdx / ElemsPerVReg;
- int SrcSubIdx = SrcIdx % ElemsPerVReg;
- if (OutMasks[DstVecIdx].first == -1)
- OutMasks[DstVecIdx].first = SrcVecIdx;
- if (OutMasks[DstVecIdx].first != SrcVecIdx)
- // Note: This case could easily be handled by keeping track of a chain
----------------
preames wrote:
Possibly, yes. In particular, using a quadratic number of shuffles (with distinct indices and masks) vs a single one is a *lot* of code size increase.
I suspect this needs a bit of thought and investigation. I am not proposing any particular heuristic, and am open to being convinced that the right heuristic is to just blindly expand. I'd just like to see it explored and justified.
Though 1-2 shuffles is way too low a threshold. You definitely want something that allows at least the linear expansion of the code you replaced.
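
To make the linear-versus-quadratic distinction concrete, here is a minimal standalone sketch. It is not code from the PR or from LLVM: `countPerRegisterShuffles` is a hypothetical helper, and it assumes a naive expansion that emits one per-register shuffle for each distinct (destination register, source register) pair the mask touches.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper, not part of LLVM. Counts the distinct
// (destination register, source register) pairs referenced by a mask.
// Under the assumption stated above, each pair costs one per-register
// shuffle with its own index vector (and possibly its own mask).
std::size_t countPerRegisterShuffles(const std::vector<int> &Mask,
                                     unsigned ElemsPerVReg) {
  std::set<std::pair<unsigned, unsigned>> Pairs;
  for (std::size_t DstIdx = 0; DstIdx < Mask.size(); ++DstIdx) {
    int SrcIdx = Mask[DstIdx];
    if (SrcIdx < 0) // Undef lane: contributes no shuffle.
      continue;
    unsigned DstReg = static_cast<unsigned>(DstIdx / ElemsPerVReg);
    unsigned SrcReg = static_cast<unsigned>(SrcIdx) / ElemsPerVReg;
    Pairs.emplace(DstReg, SrcReg);
  }
  return Pairs.size();
}
```

With 4 registers of 4 elements each, an identity mask yields 4 pairs (the linear case), while a transpose-style mask such as `{0,4,8,12, 1,5,9,13, 2,6,10,14, 3,7,11,15}` yields 16, one for every destination/source combination. A count like this is one possible input to a cost heuristic, not a proposal for a specific threshold.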
https://github.com/llvm/llvm-project/pull/120803