[PATCH] D148123: [AArch64][CostModel] Make sext/zext free if folded into a masked load

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 18 01:17:50 PDT 2023


david-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2131
+      (CCH == TTI::CastContextHint::Masked) && ST->hasSVEorSME())
+    CCH = TTI::CastContextHint::Normal;
+
----------------
david-arm wrote:
> dmgreen wrote:
> > Is this always true that they are equivalent?
> > 
> > For a `nxv8i16 load zext to nxv8i32` (without masking) you can convert it into a pair of extending load (each cost 1, so load+zext costs 2 total).
> > 
> > The same can't be done for `nxv8i16 masked_load zext to nxv8i32` without either converting the nxv8i1 mask to two nxv4i1 masks, or zext a single load with a pair of uunpk's. (For MVE both are expensive so we give the instruction a high cost, preferring lower vector factors).
> Good point! I think the new version should fix that.
This is in BasicTTIImpl::getCastInstrCost:

```
      // If we are legalizing by splitting, query the concrete TTI for the cost
      // of casting the original vector twice. We also need to factor in the
      // cost of the split itself. Count that as 1, to be consistent with
      // getTypeLegalizationCost().
      bool SplitSrc =
          TLI->getTypeAction(Src->getContext(), TLI->getValueType(DL, Src)) ==
          TargetLowering::TypeSplitVector;
      bool SplitDst =
          TLI->getTypeAction(Dst->getContext(), TLI->getValueType(DL, Dst)) ==
          TargetLowering::TypeSplitVector;
      if ((SplitSrc || SplitDst) && SrcVTy->getElementCount().isVector() &&
          DstVTy->getElementCount().isVector()) {
        Type *SplitDstTy = VectorType::getHalfElementsVectorType(DstVTy);
        Type *SplitSrcTy = VectorType::getHalfElementsVectorType(SrcVTy);
        T *TTI = static_cast<T *>(this);
        // If both types need to be split then the split is free.
        InstructionCost SplitCost =
            (!SplitSrc || !SplitDst) ? TTI->getVectorSplitCost() : 0;
        return SplitCost +
               (2 * TTI->getCastInstrCost(Opcode, SplitDstTy, SplitSrcTy, CCH,
                                          CostKind, I));
      }
```

which explains what's happening. It splits both types in half, then re-queries the zext/sext cost on the halves, where the dest is now a legal type. It just so happens that each half becomes a legal extending load, which is correct! So the extend is absorbed into each load and becomes free. The only remaining cost is the `SplitCost`, which I guess could account for the additional load required.
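
To make the arithmetic concrete, here is a toy model of that recursion. It is not LLVM code: `ToyVecTy`, `needsSplit`, `half`, `toyExtendOfLoadCost` and the 128-bit `kRegBits` granule are all hypothetical names and assumptions made up for illustration. It also assumes that an extend whose types both fit in one register always folds into the load for free, and that the split cost is 1.

```cpp
#include <cassert>

// Toy model only. None of these names exist in LLVM.
struct ToyVecTy {
  unsigned MinElts; // minimum element count, the N in nxvN
  unsigned EltBits; // element width in bits
};

// Assumption: one scalable register holds 128 bits per vscale unit.
constexpr unsigned kRegBits = 128;

// Assumption: a type is split exactly when it does not fit in one register.
bool needsSplit(ToyVecTy Ty) { return Ty.MinElts * Ty.EltBits > kRegBits; }

// Same element width, half the element count.
ToyVecTy half(ToyVecTy Ty) {
  assert(Ty.MinElts % 2 == 0 && "cannot halve an odd element count");
  return {Ty.MinElts / 2, Ty.EltBits};
}

// Cost of a zext/sext from Src to Dst whose operand is a load.
unsigned toyExtendOfLoadCost(ToyVecTy Src, ToyVecTy Dst) {
  bool SplitSrc = needsSplit(Src);
  bool SplitDst = needsSplit(Dst);
  if (SplitSrc || SplitDst) {
    // Mirrors the quoted logic: the split is free only if both sides split.
    unsigned SplitCost = (!SplitSrc || !SplitDst) ? 1 : 0;
    return SplitCost + 2 * toyExtendOfLoadCost(half(Src), half(Dst));
  }
  // Assumption: with no split needed, the extend folds into the load.
  return 0;
}
```

Under those assumptions, `toyExtendOfLoadCost({8, 16}, {8, 32})` (the `nxv8i16` to `nxv8i32` case) returns 1: the two `nxv4i16` to `nxv4i32` halves each cost 0, so only the split cost remains.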


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148123/new/

https://reviews.llvm.org/D148123


