[PATCH] D131028: [AArch64] Fix cost model for FADD vector reduction
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 3 08:52:28 PDT 2022
fhahn added reviewers: dmgreen, t.p.northover.
fhahn added a comment.
Thanks for the patch! The updated costs look like a great improvement.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2591
{ISD::ADD, MVT::v2i64, 2},
- {ISD::FADD,MVT::v4f16, 2},
- {ISD::FADD,MVT::v8f16, 2},
----------------
Those don't appear in the current version on `main`. Is it possible that the patch has been applied on top of some other local changes?
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2617
case ISD::FADD:
+ {
+ Type *EltType = ValTy->getElementType();
----------------
the formatting seems a bit off here, could you make sure it's formatted with `clang-format-diff`?
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2623
+ unsigned NumVecElts = cast<FixedVectorType>(ValTy)->getNumElements();
+ unsigned int RoundedNumVecElts = pow(2, Log2_32_Ceil(NumVecElts));
+ unsigned int BitsPerElement = EltType->getScalarSizeInBits();
----------------
nit: `std::pow`? It is Laos common to just use `unsigned` instead of `unsigned int`.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2628
+ // While there are more elements than fit into a single register,
+ // We will do element-wise vector adds to reduce
+ InstructionCost VectorFADD;
----------------
nit: reflow comment and lower case `w`. Maybe say `we will use element-wise vector adds to reduce the elements.`
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2636
+
+ // if elements don't exceed 1 vector register size, they will only be reduced by FADDPs
+ InstructionCost FADDPCosts(Log2_32_Ceil(ElementsLeft));
----------------
nit: Start with uppercase `I`. Maybe say something like `Once the remaining elements fit into a single vector register they will be reduced using pairwise adds (faddp).`
================
Comment at: llvm/test/Analysis/CostModel/AArch64/reduce-fadd.ll:48
%fadd_v4f16_fast = call fast half @llvm.vector.reduce.fadd.v4f16(half 0.0, <4 x half> undef)
%fadd_v4f16_reassoc = call reassoc half @llvm.vector.reduce.fadd.v4f16(half 0.0, <4 x half> undef)
----------------
I might have missed it, but could you make sure there's a test with `fp128` as element type?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D131028/new/
https://reviews.llvm.org/D131028
More information about the llvm-commits
mailing list