[llvm] [AArch64][CostModel] Reduce the cost of fadd reduction with fast flag (PR #108791)
Sushant Gokhale via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 17 10:32:21 PDT 2024
================
@@ -4147,6 +4147,22 @@ AArch64TTIImpl::getArithmeticReductionCost(unsigned Opcode, VectorType *ValTy,
switch (ISD) {
default:
break;
+ case ISD::FADD: {
+ if (MTy.isVector()) {
+ // FIXME: Consider cases where the number of vector elements is not power
+ // of 2.
+ const unsigned NElts = MTy.getVectorNumElements();
+ if (ValTy->getElementCount().getFixedValue() >= 2 && NElts >= 2 &&
+ isPowerOf2_32(NElts)) {
----------------
sushgokh wrote:
> The cost without fp16:
>
> ```
> ; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %fadd_v8f16 = call fast half @llvm.vector.reduce.fadd.v8f16(half 0xH0000, <8 x half> undef)
> ```
>
> seems to be the same as with fp16, I would expect it to be higher.
>
> ```
> ; FP16-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %fadd_v8f16 = call fast half @llvm.vector.reduce.fadd.v8f16(half 0xH0000, <8 x half> undef)
> ```
>
> It might be that the type legalization cost doesn't account for it as the type is still legal to some extent.
OK, I will add a check:
```
if (type == half && fp16enabled())
  return <patch cost>
else
  fall back to current cost
```
Does the above check look good?
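For reference, here is a minimal standalone sketch of that check, covering the half case only. It is not the actual patch: `halfFAddReductionCost`, `HasFullFP16`, `PatchCost` and `FallbackCost` are hypothetical names used for illustration, and the real change would query the subtarget and the legalized type inside `getArithmeticReductionCost` instead.

```cpp
// Hypothetical model of the proposed gate for a fast fadd reduction over
// half elements. None of these names come from LLVM; the two costs are
// passed in so that no concrete cost value is assumed here.
constexpr unsigned halfFAddReductionCost(bool HasFullFP16, unsigned PatchCost,
                                         unsigned FallbackCost) {
  // With native fp16 arithmetic, use the reduced cost from this patch;
  // without it, keep whatever cost the model reported before the patch.
  return HasFullFP16 ? PatchCost : FallbackCost;
}
```

With this shape, the CHECK and FP16-CHECK lines quoted above would be expected to diverge for `v8f16` whenever the two costs differ, which is the behaviour the review comment asks for.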
https://github.com/llvm/llvm-project/pull/108791