[llvm] [LoopVectorizer] Add support for partial reductions (PR #92418)
Sam Tebbs via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 16 08:18:34 PDT 2024
================
@@ -341,6 +341,43 @@ class AArch64TTIImpl : public BasicTTIImplBase<AArch64TTIImpl> {
     return BaseT::isLegalNTLoad(DataType, Alignment);
   }
+  InstructionCost getPartialReductionCost(unsigned Opcode, Type *InputType,
+                                          Type *AccumType, ElementCount VF,
+                                          PartialReductionExtendKind OpAExtend,
+                                          PartialReductionExtendKind OpBExtend,
+                                          std::optional<unsigned> BinOp) const {
+    InstructionCost Cost = InstructionCost::getInvalid();
+
+    if (Opcode != Instruction::Add)
+      return Cost;
+
+    EVT InputEVT = EVT::getEVT(InputType);
+    EVT AccumEVT = EVT::getEVT(AccumType);
+
+    if (AccumEVT.isScalableVector() && !ST->isSVEorStreamingSVEAvailable())
+      return Cost;
+    if (!AccumEVT.isScalableVector() && !ST->isNeonAvailable() &&
+        !ST->hasDotProd())
+      return Cost;
+
+    if (InputEVT == MVT::i8) {
+      if (AccumEVT != MVT::i32)
----------------
SamTebbs33 wrote:
I think you've actually uncovered a flaw in my code here. It looks like `InputType` and `AccumType` are always scalar and so the checks above aren't doing anything. I'll figure out a way to tell `getPartialReductionCost` if we're dealing with scalable or fixed-length vectors so we can check for SVE and Neon properly.
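
To illustrate the problem described above, here is a minimal, self-contained sketch. It does not use the real LLVM classes: `MockSubtarget`, `MockType`, `MockVF` and both helper functions are hypothetical stand-ins invented for this example. It assumes, as the comment says, that the types passed in are scalar, so a type-based `isScalableVector()` query is always false. Keying the checks on the vectorization factor instead is only one possible way to address this, not necessarily the fix that will land in the PR.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not the real LLVM types.
struct MockSubtarget {
  bool HasSVE;     // models ST->isSVEorStreamingSVEAvailable()
  bool HasNeon;    // models ST->isNeonAvailable()
  bool HasDotProd; // models ST->hasDotProd()
};

struct MockType {
  bool Scalable; // models EVT::isScalableVector(); false for any scalar type
};

struct MockVF {
  bool Scalable; // models ElementCount::isScalable()
};

// Same shape as the checks in the hunk: the decision is keyed on the
// accumulator type. The conditions are copied as written in the patch.
inline bool passesTypeBasedChecks(const MockSubtarget &ST, MockType Accum) {
  if (Accum.Scalable && !ST.HasSVE)
    return false;
  if (!Accum.Scalable && !ST.HasNeon && !ST.HasDotProd)
    return false;
  return true;
}

// Assumed alternative: key the same conditions on the vectorization factor,
// which does know whether the loop is being vectorized with scalable vectors.
inline bool passesVFBasedChecks(const MockSubtarget &ST, MockVF VF) {
  if (VF.Scalable && !ST.HasSVE)
    return false;
  if (!VF.Scalable && !ST.HasNeon && !ST.HasDotProd)
    return false;
  return true;
}

// Demonstrates the difference on a target with Neon but without SVE.
inline void demoScalarTypeProblem() {
  MockSubtarget NeonOnly{/*HasSVE=*/false, /*HasNeon=*/true,
                         /*HasDotProd=*/false};
  MockType ScalarAccum{/*Scalable=*/false}; // what the function receives
  MockVF ScalableVF{/*Scalable=*/true};     // what the vectorizer is costing

  // The type-based check accepts, because the SVE branch is never reached.
  assert(passesTypeBasedChecks(NeonOnly, ScalarAccum));
  // The VF-based check rejects the scalable VF on a target without SVE.
  assert(!passesVFBasedChecks(NeonOnly, ScalableVF));
}
```

With a scalar accumulator type the first condition can never fire, so a scalable VF on a target without SVE slips through the type-based version, while the VF-based version rejects it.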
https://github.com/llvm/llvm-project/pull/92418