[PATCH] D115462: [SLP]Improve shuffles cost estimation where possible.
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu May 26 09:57:34 PDT 2022
ABataev added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:6052
Cost += TTI->getShuffleCost(
- TargetTransformInfo::SK_PermuteSingleSrc,
- FixedVectorType::get(SrcVecTy->getElementType(), Sz));
- } else if (!IsIdentity) {
- auto *FirstInsert =
- cast<Instruction>(*find_if(E->Scalars, [E](Value *V) {
- return !is_contained(E->Scalars,
- cast<Instruction>(V)->getOperand(0));
- }));
- if (isUndefVector(FirstInsert->getOperand(0))) {
- Cost += TTI->getShuffleCost(TTI::SK_PermuteSingleSrc, SrcVecTy, Mask);
- } else {
- SmallVector<int> InsertMask(NumElts);
- std::iota(InsertMask.begin(), InsertMask.end(), 0);
- for (unsigned I = 0; I < NumElts; I++) {
- if (Mask[I] != UndefMaskElem)
- InsertMask[Offset + I] = NumElts + I;
- }
- Cost +=
- TTI->getShuffleCost(TTI::SK_PermuteTwoSrc, SrcVecTy, InsertMask);
- }
- }
+ TTI::SK_Select,
+ NumOfParts > 0
----------------
RKSimon wrote:
> ABataev wrote:
> > dmgreen wrote:
> > > ABataev wrote:
> > > > dmgreen wrote:
> > > > > I'm not sure I understand why this would be a SK_Select. That is a bit of an X86 special as far as I understand and doesn't always correlate well to other architectures. Why is the Mask missing too? That might be enough to help avoid the regressions if it was re-added.
> > > > 1. It is a permutation of 2 sub-vectors: the root of the buildvector and a subvector after the vectorization. Since it was a buildvector, the compiler selects elements from the root and corresponding elements from the resulting vector.
> > > >
> > > > 2. The mask is not required if TTI::SK_Select is used; the mask is used only with SK_PermuteSingleSrc and SK_PermuteTwoSrc.
> > > >
> > > > But I'll check it.
> > > AArch64 (and most other architectures AFAIU) do not have SK_Select shuffles, so it is not a lot better than SK_PermuteTwoSrc. A Mask can help to improve the cost though, if the backend can come up with something more accurate for it.
> > >
> > > I'm surprised this is not a SK_InsertSubvector with adjacent elements though - that seems like the most natural fit, unless I'm missing how this works.
> > Yep, you're right, it must be an InsertSubvector kind; I changed it to Select because some costs for InsertSubvector were not implemented.
> Was this on x86, AArch64, or some other target?
x86, IIRC.
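
To make the shuffle-kind question above concrete, here is a minimal standalone sketch (no LLVM headers) of the two-source mask that the removed code built, plus a simplified check for whether a mask is lane-preserving, i.e. select-like. The names buildInsertMask, isLanePreserving and UndefElem are hypothetical and exist only for this illustration; the bounds handling is an assumption added for the sketch, and the check is not meant to reproduce the exact LLVM mask predicates.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for UndefMaskElem from the diff above.
constexpr int UndefElem = -1;

// Mirrors the removed loop: start from the identity over the first source,
// then redirect lane Offset + I to lane I of the second source (indices
// NumElts .. 2 * NumElts - 1) for every defined Mask[I].
// Assumption for this sketch: lanes that would fall past the end are skipped.
std::vector<int> buildInsertMask(const std::vector<int> &Mask,
                                 unsigned Offset) {
  const unsigned NumElts = static_cast<unsigned>(Mask.size());
  assert(Offset <= NumElts && "offset must lie within the vector");
  std::vector<int> InsertMask(NumElts);
  std::iota(InsertMask.begin(), InsertMask.end(), 0);
  for (unsigned I = 0; I < NumElts; ++I)
    if (Mask[I] != UndefElem && Offset + I < NumElts)
      InsertMask[Offset + I] = static_cast<int>(NumElts + I);
  return InsertMask;
}

// Simplified select-like test: every defined lane I must read lane I of
// either the first source (index I) or the second source (index NumElts + I).
bool isLanePreserving(const std::vector<int> &M) {
  const int NumElts = static_cast<int>(M.size());
  for (int I = 0; I < NumElts; ++I)
    if (M[I] != UndefElem && M[I] != I && M[I] != NumElts + I)
      return false;
  return true;
}
```

For a 4-lane vector with Mask = {0, 1, undef, undef}, Offset 0 yields {4, 5, 2, 3}, which is lane-preserving. Offset 2 yields {0, 1, 4, 5}, which moves second-source lanes 0 and 1 into lanes 2 and 3; under this simplified check it is not select-like, and it looks more like the adjacent-element subvector insertion mentioned in the thread.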
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115462/new/
https://reviews.llvm.org/D115462