[llvm] [LoopVectorizer] Add support for partial reductions (PR #92418)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 16 04:04:38 PST 2024


================
@@ -342,6 +343,61 @@ class AArch64TTIImpl : public BasicTTIImplBase<AArch64TTIImpl> {
     return BaseT::isLegalNTLoad(DataType, Alignment);
   }
 
+  InstructionCost
+  getPartialReductionCost(unsigned Opcode, Type *InputType, Type *AccumType,
+                          ElementCount VF,
+                          TTI::PartialReductionExtendKind OpAExtend,
+                          TTI::PartialReductionExtendKind OpBExtend,
+                          std::optional<unsigned> BinOp) const {
+
+    InstructionCost Invalid = InstructionCost::getInvalid();
+    InstructionCost Cost(TTI::TCC_Basic);
+
+    if (Opcode != Instruction::Add)
+      return Invalid;
+
+    EVT InputEVT = EVT::getEVT(InputType);
+    EVT AccumEVT = EVT::getEVT(AccumType);
+
+    if (VF.isScalable() && !ST->isSVEorStreamingSVEAvailable())
+      return Invalid;
+    if (VF.isFixed() && !ST->isNeonAvailable() && !ST->hasDotProd())
----------------
david-arm wrote:

I think this logic is wrong. It should be:

```
  if (VF.isFixed() && (!ST->isNeonAvailable() || !ST->hasDotProd()))
    return Invalid;
```

i.e. if the VF is fixed-width, return Invalid if *either* of these is true (the current `&&` form only rejects when both are true):
1. We're in streaming mode and NEON is not available, or
2. We're using an older architecture that doesn't have dot product.
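
To illustrate the difference, here is a minimal standalone sketch comparing the two conditions. `FakeSubtarget`, `rejectsOriginal` and `rejectsSuggested` are hypothetical names made up for this example; the two bool fields only stand in for the results of `ST->isNeonAvailable()` and `ST->hasDotProd()`, and none of this is code from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget queries used in the patch.
struct FakeSubtarget {
  bool NeonAvailable; // models ST->isNeonAvailable()
  bool DotProd;       // models ST->hasDotProd()
};

// Condition as currently written: rejects a fixed-width VF only when
// NEON is unavailable AND dot product is missing.
constexpr bool rejectsOriginal(bool IsFixedVF, FakeSubtarget ST) {
  return IsFixedVF && !ST.NeonAvailable && !ST.DotProd;
}

// Suggested condition: rejects a fixed-width VF when EITHER NEON is
// unavailable OR dot product is missing.
constexpr bool rejectsSuggested(bool IsFixedVF, FakeSubtarget ST) {
  return IsFixedVF && (!ST.NeonAvailable || !ST.DotProd);
}

// NEON available, no dot product: the original form lets this through.
static_assert(!rejectsOriginal(true, {true, false}), "original accepts");
static_assert(rejectsSuggested(true, {true, false}), "suggested rejects");

// Dot product present, NEON unavailable (modelling streaming mode).
static_assert(!rejectsOriginal(true, {false, true}), "original accepts");
static_assert(rejectsSuggested(true, {false, true}), "suggested rejects");

// Both features present: neither form rejects.
static_assert(!rejectsOriginal(true, {true, true}), "original accepts");
static_assert(!rejectsSuggested(true, {true, true}), "suggested accepts");
```

The two forms only disagree when exactly one of the features is missing, which are precisely cases 1 and 2 above.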


https://github.com/llvm/llvm-project/pull/92418

