[PATCH] D133955: [AArch64][CostModel] Add costs for fixed operations when using fixed vectors over SVE

Thu May 4 08:36:55 PDT 2023

sdesmalen added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2164-2172
+  if (ST->hasSVE() && SrcTy.isFixedLengthVector() &&
+      DstTy.isFixedLengthVector() &&
+      SrcTy.getVectorNumElements() == DstTy.getVectorNumElements()) {
+    EVT WiderTy = SrcTy.bitsGT(DstTy) ? SrcTy : DstTy;
+    std::pair<InstructionCost, MVT> LT =
+        getTypeLegalizationCost(WiderTy.getTypeForEVT(Dst->getContext()));
+    if (LT.second.getFixedSizeInBits() > 128 ||
----------------
You can remove a lot of the conditions here.

```WiderTy.getFixedSizeInBits() > 128 ||
        (ST->forceStreamingCompatibleSVE() &&
         (WiderTy.is128BitVector() || WiderTy.is64BitVector()))```
can be replaced by:
```ST->useSVEForFixedLengthVectors()```

You also only need to check that `SrcTy.isFixedLengthVector()`, because it's not possible to convert between scalable/fixed, similarly, the number of elements must match. This means you can change everything to:

  if (ST->useSVEForFixedLengthVectors() && SrcTy.isFixedLengthVector()) { 
    EVT WiderTy = SrcTy.bitsGT(DstTy) ? SrcTy : DstTy;
    std::pair<InstructionCost, MVT> LT =
      getTypeLegalizationCost(WiderTy.getTypeForEVT(Dst->getContext()));
    ...
    return AdjustCost(....);
  }

and do away with the nested if-condition.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2174
+      unsigned NumElements =
+          128 / LT.second.getVectorElementType().getSizeInBits();
+      InstructionCost ExtraOps =
----------------
Please use `AArch64::SVEBitsPerBlock` instead of hard-coding 128.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2175-2176
+          128 / LT.second.getVectorElementType().getSizeInBits();
+      InstructionCost ExtraOps =
+          DstTy.isFloatingPoint() && SrcTy.isFloatingPoint() ? 2 : 1;
+      return AdjustCost(
----------------
It seems odd that you need to add some ExtraOps cost for FP casts, because I would expect that to be represented in the cost returned by `getCastInstrCost`. Do those costs need updating? Or is this related to the subvector inserts/extracts?

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2181-2182
+              Opcode,
+              ScalableVectorType::get(cast<VectorType>(Dst)->getElementType(),
+                                      NumElements),
+              ScalableVectorType::get(cast<VectorType>(Src)->getElementType(),
----------------
nit: can you create a utility function for this, similar to what we've done for `getContainerForFixedLengthVector` in AArch64ISelLowering.cpp, but then one that returns a `Type *`.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1818
   EVT DstTy = TLI->getValueType(DL, Dst);
+  if (SrcTy == DstTy)
+    return 0;
----------------
This conditional return seems redundant, because when I remove this change, the tests still pass.

================
Comment at: llvm/test/Analysis/CostModel/AArch64/cast.ll:4
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=aarch64-none-linux-gnueabi -mattr=+fullfp16 %s | FileCheck --check-prefixes=CHECK,CHECK-FP16 %s
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=aarch64-none-linux-gnueabi -mattr=+sve -scalable-vectorization=off -aarch64-sve-vector-bits-min=256 %s | FileCheck --check-prefixes=SVE,FIXED-OVER-SVE256 %s
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=aarch64-none-linux-gnueabi -mattr=+sve -scalable-vectorization=off -aarch64-sve-vector-bits-min=2048 %s | FileCheck --check-prefixes=SVE,FIXED-OVER-SVE2048 %s
----------------
this flag is not relevant to this test, because no vectorization is done by `print<cost-model>`.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133955/new/

https://reviews.llvm.org/D133955