[PATCH] D131028: [AArch64] Fix cost model for FADD vector reduction

Wed Aug 3 08:52:28 PDT 2022

fhahn added reviewers: dmgreen, t.p.northover.
fhahn added a comment.

Thanks for the patch! The updated costs look like a great improvement.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2591
       {ISD::ADD, MVT::v2i64,  2},
-      {ISD::FADD,MVT::v4f16,  2},
-      {ISD::FADD,MVT::v8f16,  2},
----------------
Those don't appear in the current version on `main`. Is it possible that the patch has been applied on top of some other local changes?

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2617
   case ISD::FADD:
+    {
+      Type *EltType = ValTy->getElementType();
----------------
the formatting seems a bit off here, could you make sure it's formatted with `clang-format-diff`?

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2623
+      unsigned NumVecElts = cast<FixedVectorType>(ValTy)->getNumElements();
+      unsigned int RoundedNumVecElts = pow(2, Log2_32_Ceil(NumVecElts));
+      unsigned int BitsPerElement = EltType->getScalarSizeInBits();
----------------
nit: `std::pow`? It is Laos common to just use `unsigned` instead of `unsigned int`.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2628
+      // While there are more elements than fit into a single register,
+      // We will do element-wise vector adds to reduce
+      InstructionCost VectorFADD;
----------------
nit: reflow comment and lower case `w`. Maybe say `we will use element-wise vector adds to reduce the elements.`

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2636
+      
+      // if elements don't exceed 1 vector register size, they will only be reduced by FADDPs
+      InstructionCost FADDPCosts(Log2_32_Ceil(ElementsLeft));
----------------
nit: Start with uppercase `I`. Maybe say something like `Once the remaining elements fit into a single vector register they will be reduced using pairwise adds (faddp).`

================
Comment at: llvm/test/Analysis/CostModel/AArch64/reduce-fadd.ll:48
   %fadd_v4f16_fast = call fast half @llvm.vector.reduce.fadd.v4f16(half 0.0, <4 x half> undef)
   %fadd_v4f16_reassoc = call reassoc half @llvm.vector.reduce.fadd.v4f16(half 0.0, <4 x half> undef)

----------------
I might have missed it, but could you make sure there's a test with `fp128` as element type?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131028/new/

https://reviews.llvm.org/D131028