[PATCH] D133955: [AArch64][CostModel] Add costs for fixed operations when using fixed vectors over SVE

Thu Feb 9 13:37:43 PST 2023

dtemirbulatov added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1938-1942
+      const unsigned NumElements =
+          ((WiderTy.getFixedSizeInBits() / ST->getMinSVEVectorSizeInBits()) <=
+           1)
+              ? 128 / WiderTy.getVectorElementType().getSizeInBits()
+              : WiderTy.getVectorNumElements();
----------------
paulwalker-arm wrote:
> sdesmalen wrote:
> > Right now your code assumes that if `min-sve-vector-bits=256` and `WideVT = <8 x double>` (which means the legalisation requires a multiplier of 2), that the scalable type must be `<vscale x 8 x double>` which is a multiplier of 4. So that doesn't seem right.
> I think you've misunderstood my previous comment, as now we're back to missing the conversion from fixed-length num elts to scalable num elts.  I know I've mentioned this before and it seemed a dead end, but I'm going to ask again: why can we not use the existing legalisation cost functions?  I'm thinking of something like:
> 
> ```
> std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(WiderType);
> if (LT.second.getFixedSizeInBits() > 128 || ST->forceStreamingCompatibleSVE()) {
>       unsigned NumElements = 128 / LT.second.getVectorElementType().getSizeInBits();
> 
> ....
> 
> return AdjustCost(LT.first * getCastInstrCost(....
> ```
> Doing this means we don't need to worry about `getMinSVEVectorSizeInBits`.
This might be a simplification, but here is the logic behind those calculations. With "min-sve-vector-bits=256" we assume that the hardware supports <vscale x 4 x double>, so at minimum we can operate on <4 x double> as a fixed type, and we can double the cost to operate on <8 x double>.
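
To make the arithmetic in this thread concrete, here is a minimal standalone sketch. The helper names are hypothetical and are not LLVM APIs; it only assumes that SVE registers are a multiple of 128 bits and that a fixed vector wider than the minimum register size is split into equal parts.

```
#include <cassert>

// Hypothetical helpers for illustration only; none of these exist in LLVM.

// Number of register-sized parts a fixed-length vector of FixedBits splits
// into when each register is assumed to hold MinSVEBits bits (rounded up).
constexpr unsigned legalizationParts(unsigned FixedBits, unsigned MinSVEBits) {
  return (FixedBits + MinSVEBits - 1) / MinSVEBits;
}

// Elements per 128-bit granule, i.e. N in <vscale x N x elt>.
constexpr unsigned scalableMinElements(unsigned EltBits) {
  return 128 / EltBits;
}

// Cost of the wide operation: cost of one part times the number of parts.
inline unsigned scaledCost(unsigned FixedBits, unsigned MinSVEBits,
                           unsigned PartCost) {
  // Assumption: the minimum SVE size is a non-zero multiple of 128 bits.
  assert(MinSVEBits >= 128 && MinSVEBits % 128 == 0);
  return legalizationParts(FixedBits, MinSVEBits) * PartCost;
}

// <8 x double> is 512 bits; with min-sve-vector-bits=256 it needs 2 parts.
static_assert(legalizationParts(8 * 64, 256) == 2, "multiplier of 2");
// <4 x double> fits in one 256-bit register.
static_assert(legalizationParts(4 * 64, 256) == 1, "single part");
// For 64-bit elements, one 128-bit granule holds 2 elements.
static_assert(scalableMinElements(64) == 2, "two doubles per granule");
```

Under these assumptions the multiplier for <8 x double> is 2, and the element count derived from `128 / element size` is 2 for doubles, independent of `min-sve-vector-bits`.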

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133955/new/

https://reviews.llvm.org/D133955