[PATCH] D133955: [AArch64][CostModel] Add costs for fixed operations when using fixed vectors over SVE

Fri May 12 05:48:24 PDT 2023

dtemirbulatov added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2164-2172
+  if (ST->hasSVE() && SrcTy.isFixedLengthVector() &&
+      DstTy.isFixedLengthVector() &&
+      SrcTy.getVectorNumElements() == DstTy.getVectorNumElements()) {
+    EVT WiderTy = SrcTy.bitsGT(DstTy) ? SrcTy : DstTy;
+    std::pair<InstructionCost, MVT> LT =
+        getTypeLegalizationCost(WiderTy.getTypeForEVT(Dst->getContext()));
+    if (LT.second.getFixedSizeInBits() > 128 ||
----------------
sdesmalen wrote:
> dtemirbulatov wrote:
> > sdesmalen wrote:
> > > You can remove a lot of the conditions here.
> > > 
> > > ```WiderTy.getFixedSizeInBits() > 128 ||
> > >         (ST->forceStreamingCompatibleSVE() &&
> > >          (WiderTy.is128BitVector() || WiderTy.is64BitVector()))```
> > > can be replaced by:
> > > ```ST->useSVEForFixedLengthVectors()```
> > > 
> > > You also only need to check that `SrcTy.isFixedLengthVector()`, because it's not possible to convert between scalable/fixed, similarly, the number of elements must match. This means you can change everything to:
> > > 
> > >   if (ST->useSVEForFixedLengthVectors() && SrcTy.isFixedLengthVector()) { 
> > >     EVT WiderTy = SrcTy.bitsGT(DstTy) ? SrcTy : DstTy;
> > >     std::pair<InstructionCost, MVT> LT =
> > >       getTypeLegalizationCost(WiderTy.getTypeForEVT(Dst->getContext()));
> > >     ...
> > >     return AdjustCost(....);
> > >   }
> > > 
> > > and do away with the nested if-condition.
> > ok, but I think I still have to check for DstTy.isFixedLengthVector(). I just recently encountetred case where the source was a fixed length vector and the destanation was something like i128 type. 
> Fair point, I was mostly thinking about vector-vector conversions. Bitcasts between (fixed-length) vectors and scalars should also be considered indeed.
I think we wa

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2175-2176
+          128 / LT.second.getVectorElementType().getSizeInBits();
+      InstructionCost ExtraOps =
+          DstTy.isFloatingPoint() && SrcTy.isFloatingPoint() ? 2 : 1;
+      return AdjustCost(
----------------
sdesmalen wrote:
> It seems odd that you need to add some ExtraOps cost for FP casts, because I would expect that to be represented in the cost returned by `getCastInstrCost`. Do those costs need updating? Or is this related to the subvector inserts/extracts?
yes, I am thinking that at this point we don't want to overcomplicate the proposed change, we can tune it later if any issue occurs.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133955/new/

https://reviews.llvm.org/D133955