[PATCH] D148123: [AArch64][CostModel] Make sext/zext free if folded into a masked load
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 18 01:17:50 PDT 2023
david-arm added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2131
+ (CCH == TTI::CastContextHint::Masked) && ST->hasSVEorSME())
+ CCH = TTI::CastContextHint::Normal;
+
----------------
david-arm wrote:
> dmgreen wrote:
> > Is this always true that they are equivalent?
> >
> > For a `nxv8i16 load zext to nxv8i32` (without masking) you can convert it into a pair of extending load (each cost 1, so load+zext costs 2 total).
> >
> > The same can't be done for `nxv8i16 masked_load zext to nxv8i32` without either converting the nxv8i1 mask to two nxv4i1 masks, or zext a single load with a pair of uunpk's. (For MVE both are expensive so we give the instruction a high cost, preferring lower vector factors).
> Good point! I think the new version should fix that.
This is in BasicTTIImpl::getCastInstrCost:
``` // If we are legalizing by splitting, query the concrete TTI for the cost
// of casting the original vector twice. We also need to factor in the
// cost of the split itself. Count that as 1, to be consistent with
// getTypeLegalizationCost().
bool SplitSrc =
TLI->getTypeAction(Src->getContext(), TLI->getValueType(DL, Src)) ==
TargetLowering::TypeSplitVector;
bool SplitDst =
TLI->getTypeAction(Dst->getContext(), TLI->getValueType(DL, Dst)) ==
TargetLowering::TypeSplitVector;
if ((SplitSrc || SplitDst) && SrcVTy->getElementCount().isVector() &&
DstVTy->getElementCount().isVector()) {
Type *SplitDstTy = VectorType::getHalfElementsVectorType(DstVTy);
Type *SplitSrcTy = VectorType::getHalfElementsVectorType(SrcVTy);
T *TTI = static_cast<T *>(this);
// If both types need to be split then the split is free.
InstructionCost SplitCost =
(!SplitSrc || !SplitDst) ? TTI->getVectorSplitCost() : 0;
return SplitCost +
(2 * TTI->getCastInstrCost(Opcode, SplitDstTy, SplitSrcTy, CCH,
CostKind, I));
}```
which explains what's happening. It splits the types, then recalculates the zext/sext when the dest is a legal type. It just so happens that this becomes a legal extending load, which is correct! So the extend is absorbed into each load and becomes free. The only cost is then the `SplitCost`, which I guess could account for the additional load required.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D148123/new/
https://reviews.llvm.org/D148123
More information about the llvm-commits
mailing list