[llvm] [LoopVectorizer] Add support for partial reductions (PR #92418)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 16 04:04:38 PST 2024
================
@@ -342,6 +343,61 @@ class AArch64TTIImpl : public BasicTTIImplBase<AArch64TTIImpl> {
return BaseT::isLegalNTLoad(DataType, Alignment);
}
+ InstructionCost
+ getPartialReductionCost(unsigned Opcode, Type *InputType, Type *AccumType,
+ ElementCount VF,
+ TTI::PartialReductionExtendKind OpAExtend,
+ TTI::PartialReductionExtendKind OpBExtend,
+ std::optional<unsigned> BinOp) const {
+
+ InstructionCost Invalid = InstructionCost::getInvalid();
+ InstructionCost Cost(TTI::TCC_Basic);
+
+ if (Opcode != Instruction::Add)
+ return Invalid;
+
+ EVT InputEVT = EVT::getEVT(InputType);
+ EVT AccumEVT = EVT::getEVT(AccumType);
+
+ if (VF.isScalable() && !ST->isSVEorStreamingSVEAvailable())
+ return Invalid;
+ if (VF.isFixed() && !ST->isNeonAvailable() && !ST->hasDotProd())
----------------
david-arm wrote:
I think this logic is wrong; it should be:
```
if (VF.isFixed() && (!ST->isNeonAvailable() || !ST->hasDotProd()))
return Invalid;
```
i.e. if the VF is fixed-width then return Invalid if *either* of these is true:
1. We're in streaming mode and NEON is not available, or
2. We're using an older architecture that doesn't have dot product.
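To make the difference concrete, here is a minimal standalone sketch comparing the two conditions for the fixed-width case. The function names `rejectsOriginal` and `rejectsProposed` are hypothetical, and the two bool parameters are plain stand-ins for `ST->isNeonAvailable()` and `ST->hasDotProd()`; none of this is code from the patch.

```cpp
// Hypothetical stand-ins: NeonAvailable models ST->isNeonAvailable(),
// HasDotProd models ST->hasDotProd(). VF.isFixed() is assumed true.

// Condition as written in the patch: rejects only when BOTH are missing.
constexpr bool rejectsOriginal(bool NeonAvailable, bool HasDotProd) {
  return !NeonAvailable && !HasDotProd;
}

// Condition suggested in the review: rejects when EITHER is missing.
constexpr bool rejectsProposed(bool NeonAvailable, bool HasDotProd) {
  return !NeonAvailable || !HasDotProd;
}

// The two conditions agree when both features are present or both absent.
static_assert(!rejectsOriginal(true, true) && !rejectsProposed(true, true),
              "NEON and dot product present: neither condition rejects");
static_assert(rejectsOriginal(false, false) && rejectsProposed(false, false),
              "NEON and dot product absent: both conditions reject");

// They differ when exactly one feature is missing.
static_assert(!rejectsOriginal(true, false) && rejectsProposed(true, false),
              "no dot product: only the proposed condition rejects");
static_assert(!rejectsOriginal(false, true) && rejectsProposed(false, true),
              "no NEON (streaming mode): only the proposed condition rejects");
```

With the original `&&` form, a target with NEON but without dot product (or the reverse) would fall through and could be given a valid cost, which is the situation the suggested change is meant to rule out.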
https://github.com/llvm/llvm-project/pull/92418