[PATCH] D133955: [AArch64][CostModel] Add costs for fixed operations when using fixed vectors over SVE

Wed Dec 14 08:22:09 PST 2022

sdesmalen added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1930
+  if (ST->hasSVE() && ST->getMinSVEVectorSizeInBits() &&
+      SrcTy.isFixedLengthVector() && DstTy.isFixedLengthVector() &&
+      SrcTy != DstTy) {
----------------
I expect that this condition is redundant, because if `SrcTy.isFixedLengthVector() == true`, then `DstTy` must also be a fixed-length vector, otherwise the IR would not be valid. Can you change this into an `assert`?

Also, I see you've marked Paul's comment about using `useSVEForFixedLengthVectors` as Done, but I don't see the code using it. I think this condition can just be:

  if (SrcTy.isFixedLengthVector() && useSVEForFixedLengthVectors() &&
      (SrcVT.getFixedSizeInBits() > 128 || DstVT.getFixedSizeInBits() > 128 ||
       ST->forceStreamingCompatibleSVE())) {
    ...
  }

Which then removes the need for the extra condition below.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1931
+      SrcTy.isFixedLengthVector() && DstTy.isFixedLengthVector() &&
+      SrcTy != DstTy) {
+    EVT WiderTy = (SrcTy.getFixedSizeInBits() > DstTy.getFixedSizeInBits())
----------------
I guess this condition can be hoisted up at the start of this function to become:

  if (SrcTy == DstTy)
    return 0;

Because that operation would be a nop.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1938-1942
+      const unsigned NumElements =
+          ((WiderTy.getFixedSizeInBits() / ST->getMinSVEVectorSizeInBits()) <=
+           1)
+              ? 128 / WiderTy.getVectorElementType().getSizeInBits()
+              : WiderTy.getVectorNumElements();
----------------
Right now your code assumes that if `min-sve-vector-bits=256` and `WideVT = <8 x double>` (which means the legalisation requires a multiplier of 2), that the scalable type must be `<vscale x 8 x double>` which is a multiplier of 4. So that doesn't seem right.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1948
+                          true),
+          VectorType::get(cast<VectorType>(Src)->getElementType(), NumElements,
+                          true),
----------------
I guess this only works if NumElements is the same for both Src and Dst, i.e. a (zero-cost) `bitcast` where the number of elements may differ should not be handled here. Perhaps that condition should be added to the outer condition that I suggested above.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133955/new/

https://reviews.llvm.org/D133955