[PATCH] D90781: [ARM] remove cost-kind predicate for cmp/sel costs
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 5 06:32:29 PST 2020
spatel added inline comments.
================
Comment at: llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll:220
; SIZE_LATE-LABEL: 'reduce_fmax'
-; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 620 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
+; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 628 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
----------------
samparker wrote:
> I know this is a tiny change, but it's drawn my attention because the number is so high. Does this look right to you, @dmgreen? I would have thought we were able to break this up more efficiently with our native support, or is this because we'd have to copy the GPR registers into FPRs to perform some final scalar maxes?
Stepping through this, the cost is derived from the BasicTTIImpl calling back to the target for shuffle+cmp+sel:
```
// Assume the pairwise shuffles add a cost.
ShuffleCost +=
    (IsPairwise + 1) * thisT()->getShuffleCost(TTI::SK_ExtractSubvector,
                                               Ty, NumVecElts, SubTy);
MinMaxCost +=
    thisT()->getCmpSelInstrCost(CmpOpcode, SubTy, CondTy,
                                CmpInst::BAD_ICMP_PREDICATE, CostKind) +
    thisT()->getCmpSelInstrCost(Instruction::Select, SubTy, CondTy,
                                CmpInst::BAD_ICMP_PREDICATE, CostKind);
```
And this progresses for v16f32 as: 384 -> 480 -> 608 for the shuffles and 8 -> 12 -> 20 for the cmp/sel, so 608 + 20 = 628.
The shuffle cost seems to be expanded as a series of insert/extract based on the number of elements in the vector, so it's exploding.
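For reference, the arithmetic above can be replayed in a small standalone sketch. It assumes the figures quoted are running totals of ShuffleCost and MinMaxCost after each accumulation step. The names ShuffleRunning, MinMaxRunning, stepDelta and reductionCost are hypothetical and not part of LLVM, and nothing here calls the real cost model:
```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Running totals quoted in this message for v16f32 (assumed to be the
// accumulated values of ShuffleCost and MinMaxCost after each step).
constexpr std::array<int, 3> ShuffleRunning = {384, 480, 608};
constexpr std::array<int, 3> MinMaxRunning = {8, 12, 20};

// Hypothetical helper: recover how much a single step added to a total.
constexpr int stepDelta(const std::array<int, 3> &Running, std::size_t I) {
  return I == 0 ? Running[0] : Running[I] - Running[I - 1];
}

// The reported cost is the final shuffle total plus the final cmp/sel total.
inline int reductionCost() {
  int Total = ShuffleRunning.back() + MinMaxRunning.back();
  assert(Total == 628 && "should match the updated SIZE_LATE expectation");
  return Total;
}
```
Under that reading, the per-step shuffle contributions are 384, 96 and 128 against 8, 4 and 8 for cmp/sel, so the shuffles account for nearly all of the 628.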
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D90781/new/
https://reviews.llvm.org/D90781