[llvm] [VPlan] Extract reverse operation for reverse accesses (PR #146525)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 1 09:02:30 PST 2025
================
@@ -2866,28 +2867,42 @@ static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
TypeInfo.inferScalarType(MaxEVL), DebugLoc::getUnknown());
Builder.setInsertPoint(Header, Header->getFirstNonPhi());
- VPValue *PrevEVL = Builder.createScalarPhi(
- {MaxEVL, &EVL}, DebugLoc::getUnknown(), "prev.evl");
-
- for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
- vp_depth_first_deep(Plan.getVectorLoopRegion()->getEntry()))) {
- for (VPRecipeBase &R : *VPBB) {
- VPValue *V1, *V2;
- if (!match(&R,
- m_VPInstruction<VPInstruction::FirstOrderRecurrenceSplice>(
- m_VPValue(V1), m_VPValue(V2))))
- continue;
+ PrevEVL = Builder.createScalarPhi({MaxEVL, &EVL}, DebugLoc::getUnknown(),
+ "prev.evl");
+ }
+
+ // Transform the recipes must be converted to vector predication intrinsics
+ // even if they do not use header mask.
+ for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(
+ vp_depth_first_deep(Plan.getVectorLoopRegion()->getEntry()))) {
+ for (VPRecipeBase &R : *VPBB) {
+ VPWidenIntrinsicRecipe *NewRecipe = nullptr;
+ VPValue *V1, *V2;
+ if (match(&R, m_VPInstruction<VPInstruction::FirstOrderRecurrenceSplice>(
+ m_VPValue(V1), m_VPValue(V2)))) {
VPValue *Imm = Plan.getOrAddLiveIn(
ConstantInt::getSigned(Type::getInt32Ty(Plan.getContext()), -1));
- VPWidenIntrinsicRecipe *VPSplice = new VPWidenIntrinsicRecipe(
+ NewRecipe = new VPWidenIntrinsicRecipe(
Intrinsic::experimental_vp_splice,
{V1, V2, Imm, Plan.getTrue(), PrevEVL, &EVL},
TypeInfo.inferScalarType(R.getVPSingleValue()), {}, {},
R.getDebugLoc());
- VPSplice->insertBefore(&R);
- R.getVPSingleValue()->replaceAllUsesWith(VPSplice);
- ToErase.push_back(&R);
}
+
+ // TODO: Only convert reverse to vp.reverse if it uses the result of
+ // vp.load, or defines the stored value of vp.store.
----------------
lukel97 wrote:
> We could convert reverse accesses into Splice(VPWidenLoadEVLRecipe(VecEndPtr(ptr, evl)), poison, -evl) inside optimizeMaskToEVLRecipes, and rely on the regular reverse rather than vp.reverse.
Yup, this is what I had in mind: https://github.com/llvm/llvm-project/commit/32504676f616a98d3282ef2601550e6ed3e25714
This approach is safer and easier to reason about since the semantics of the VPlan never change.
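
To make the lane movement concrete, here is a minimal sketch in plain C++ that models vectors as `std::vector` and poison as `std::nullopt`. It is not VPlan code and not what the linked commit does verbatim. The helper names (`loadEVL`, `reverseAll`, `spliceTrailing`, `vpReverse`) are hypothetical, and placing the regular reverse before the splice is an assumption made so that the lane arithmetic works out in this model. It only illustrates that an EVL load, a regular full-width reverse, and a splice by `-evl` can produce the same active lanes as a `vp.reverse`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// One vector lane: a defined value, or poison modelled as std::nullopt.
using Lane = std::optional<int>;
using Vec = std::vector<Lane>;

// EVL load model: lanes [0, EVL) read Mem[Base + i]; lanes [EVL, VF) are poison.
Vec loadEVL(const std::vector<int> &Mem, std::size_t Base, std::size_t VF,
            std::size_t EVL) {
  Vec R(VF);
  for (std::size_t I = 0; I < EVL; ++I)
    R[I] = Mem[Base + I];
  return R;
}

// Regular full-width reverse: lane i of the result is lane VF-1-i of V.
Vec reverseAll(const Vec &V) { return Vec(V.rbegin(), V.rend()); }

// Splice with a negative offset -N: the trailing N lanes of A, followed by
// the leading VF-N lanes of B.
Vec spliceTrailing(const Vec &A, const Vec &B, std::size_t N) {
  auto Off = static_cast<std::ptrdiff_t>(N);
  Vec R(A.end() - Off, A.end());
  R.insert(R.end(), B.begin(), B.end() - Off);
  return R;
}

// vp.reverse model: reverses only the first EVL lanes; the rest are poison.
Vec vpReverse(const Vec &V, std::size_t EVL) {
  Vec R(V.size());
  for (std::size_t I = 0; I < EVL; ++I)
    R[I] = V[EVL - 1 - I];
  return R;
}

// Reverse access of EVL elements, highest address Mem[Last], built from an
// EVL load, a regular reverse, and a splice (assumed ordering, see above).
Vec reverseLoadViaSplice(const std::vector<int> &Mem, std::size_t Last,
                         std::size_t VF, std::size_t EVL) {
  assert(EVL >= 1 && EVL <= VF && Last + 1 >= EVL);
  Vec Loaded = loadEVL(Mem, Last - (EVL - 1), VF, EVL);
  Vec Poison(VF); // all lanes poison
  return spliceTrailing(reverseAll(Loaded), Poison, EVL);
}

// The same access expressed with the vp.reverse model, for comparison.
Vec reverseLoadViaVPReverse(const std::vector<int> &Mem, std::size_t Last,
                            std::size_t VF, std::size_t EVL) {
  assert(EVL >= 1 && EVL <= VF && Last + 1 >= EVL);
  return vpReverse(loadEVL(Mem, Last - (EVL - 1), VF, EVL), EVL);
}
```

In this model, with `VF = 4` and `EVL = 3`, the full-width reverse leaves the three loaded elements in the top three lanes, and the splice moves them back down to lanes 0 to 2, which is where `vp.reverse` would have put them. The reverse itself keeps its ordinary full-width meaning throughout.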
> The only concern is that, if we introduce simplification rules that can eliminate the reverse, there will be a temporary performance regression because the reverse access might not be lowered into VPWidenLoadEVLRecipe/VPWidenStoreEVLRecipe. However, correctness should not be affected.
I can work on generalising the transform so that it is expressed in terms of splices rather than reverses, which would avoid the regression when the reverses are eliminated. I don't think that should block this PR; I'm happy to iterate on this in tree.
Btw, I've posted an [RFC to relax the requirements on the splice intrinsic](https://discourse.llvm.org/t/rfc-allow-non-constant-offsets-in-llvm-vector-splice/88974/3) and will try to push that through.
https://github.com/llvm/llvm-project/pull/146525