[llvm] [LoopVectorizer] Add support for partial reductions (PR #92418)

Fri Dec 6 05:47:34 PST 2024

================
@@ -291,6 +291,68 @@ InstructionCost VPRecipeBase::computeCost(ElementCount VF,
   llvm_unreachable("subclasses should implement computeCost");
 }
 
+InstructionCost
+VPPartialReductionRecipe::computeCost(ElementCount VF,
+                                      VPCostContext &Ctx) const {
+  std::optional<unsigned> Opcode = std::nullopt;
+  VPRecipeBase *BinOpR = getOperand(0)->getDefiningRecipe();
+  if (auto *WidenR = dyn_cast<VPWidenRecipe>(BinOpR))
+    Opcode = std::make_optional(WidenR->getOpcode());
+
+  VPRecipeBase *ExtAR = BinOpR->getOperand(0)->getDefiningRecipe();
+  VPRecipeBase *ExtBR = BinOpR->getOperand(1)->getDefiningRecipe();
+
+  auto GetExtendKind = [](VPRecipeBase *R) {
+    auto *WidenCastR = dyn_cast<VPWidenCastRecipe>(R);
+    if (!WidenCastR)
+      return TargetTransformInfo::PR_None;
+    if (WidenCastR->getOpcode() == Instruction::CastOps::ZExt)
+      return TargetTransformInfo::PR_ZeroExtend;
+    if (WidenCastR->getOpcode() == Instruction::CastOps::SExt)
+      return TargetTransformInfo::PR_SignExtend;
+    return TargetTransformInfo::PR_None;
+  };
+
+  auto *PhiType = Ctx.Types.inferScalarType(getOperand(1));
+  auto *ExtTy = Ctx.Types.inferScalarType(ExtAR->getOperand(0));
+
+  return Ctx.TTI.getPartialReductionCost(
+      getUnderlyingInstr()->getOpcode(), ExtTy, PhiType, VF,
+      GetExtendKind(ExtAR), GetExtendKind(ExtBR), Opcode);
+}
+
+void VPPartialReductionRecipe::execute(VPTransformState &State) {
+  State.setDebugLocFrom(getDebugLoc());
+  auto &Builder = State.Builder;
+
+  assert(getUnderlyingInstr()->getOpcode() == Instruction::Add &&
----------------
fhahn wrote:

The two are independent. Setting the instruction as the underlying value is fine, since it improves printing (by using the IR name), and it is preferred over storing the instruction in a separate field.

But if possible it should be optional (i.e. only used for printing). The problem with requiring the underlying instruction is that it prevents VPlan transformations from creating the recipe without underlying IR instructions.

I may not have found the comment you were originally referring to, but https://github.com/llvm/llvm-project/pull/92418#discussion_r1848753925 suggested using the constructor that sets the underlying value; it did not suggest dropping `Opcode`.
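
To make the suggestion concrete, here is a minimal standalone sketch of the shape being described: the recipe always stores its own `Opcode`, and the underlying instruction is optional and only consulted for printing. Everything in it (`MockInstruction`, `MockPartialReductionRecipe`, `MockAddOpcode`, the printed format) is a hypothetical stand-in written for illustration; it is not the VPlan API and not the code in this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an IR instruction; only the fields the sketch needs.
struct MockInstruction {
  unsigned Opcode;
  std::string Name;
};

// Arbitrary illustrative opcode value, not the real Instruction::Add.
constexpr unsigned MockAddOpcode = 1;

// Hypothetical stand-in for a recipe such as VPPartialReductionRecipe.
class MockPartialReductionRecipe {
  unsigned Opcode;                   // always set; cost and codegen rely on this
  const MockInstruction *Underlying; // optional; only used for printing

public:
  // Built from IR: keep the instruction as the underlying value for nicer
  // printing, but still record the opcode in the recipe itself.
  explicit MockPartialReductionRecipe(const MockInstruction &I)
      : Opcode(I.Opcode), Underlying(&I) {}

  // Built by a transformation with no IR instruction available.
  explicit MockPartialReductionRecipe(unsigned Opcode)
      : Opcode(Opcode), Underlying(nullptr) {}

  // Never dereferences Underlying, so it works for both constructors.
  unsigned getOpcode() const { return Opcode; }

  // Printing falls back to a placeholder when there is no underlying value.
  std::string print() const {
    std::string Name = Underlying ? Underlying->Name : "<no-ir>";
    return "PARTIAL-REDUCE " + Name + " opcode=" + std::to_string(Opcode);
  }
};
```

With this layout, a recipe constructed from `MockAddOpcode` alone reports the same opcode as one constructed from an instruction, and only the printed name differs. Code that instead queried the opcode through the underlying instruction would have nothing to query in the second case.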

https://github.com/llvm/llvm-project/pull/92418