[llvm] [VPlan] Fold safe divisors into VP intrinsics with EVL (PR #148828)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 16 10:03:37 PDT 2025
================
@@ -2176,6 +2176,52 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
.Default([&](VPRecipeBase *R) { return nullptr; });
}
+/// Try to optimize safe divisors away by converting their users to VP
+/// intrinsics:
+///
+/// udiv x, (vp.merge allones, y, 1, evl) -> vp.udiv x, y, allones, evl
+///
+/// Note the lanes past EVL will be changed from x to poison. This only works
+/// for the EVL-based IV and not any arbitrary EVL, because we know nothing
+/// will read the lanes past the EVL-based IV.
----------------
lukel97 wrote:
The users of the op aren't predicated, in the sense that they're neither converted to VPWidenIntrinsic VP intrinsic recipes nor predicated in terms of `LoopVectorizationCostModel::isPredicatedInst`.
I guess the point this comment is trying to clarify is that tail folding has an invariant: for any recipe, none of the inactive lanes (or lanes past EVL) will be used. That invariant is what this transform relies on to be correct.
I think this is similar to how we can't use regular ExtractLastElement with tail folding, and we need https://github.com/llvm/llvm-project/pull/149042 to make sure we only access the last active lane.
The EVL-based IV bit stems from the fact that we can't fold e.g. `udiv x, (vp.merge allones, y, 1, foo) -> vp.udiv x, y, allones, foo` for an arbitrary foo, because we don't know that the lanes past foo won't be read. But we can guarantee that when foo is the EVL-based IV.
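To make the lane argument concrete, here is a rough scalar model in standalone C++. It is only a sketch and not LLVM code: the helper names (`vpMergeAllOnes`, `plainUDiv`, `vpUDivAllOnes`, `agreeOnActiveLanes`) are made up for illustration, and poison is modelled as an empty `std::optional`. It assumes an all-ones mask throughout, as in the fold above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model: a lane is std::nullopt when its value is poison.
using Lane = std::optional<uint32_t>;
using Vec = std::vector<Lane>;

// Models `vp.merge allones, y, onFalse, evl`: lanes below EVL take y,
// lanes at or past EVL take the on-false value (the safe divisor 1).
std::vector<uint32_t> vpMergeAllOnes(const std::vector<uint32_t> &Y,
                                     uint32_t OnFalse, std::size_t EVL) {
  std::vector<uint32_t> R(Y.size());
  for (std::size_t I = 0; I < Y.size(); ++I)
    R[I] = I < EVL ? Y[I] : OnFalse;
  return R;
}

// Models a plain `udiv x, d` over every lane. A zero divisor in any lane
// would be immediate UB, which this sketch models as an assertion failure.
Vec plainUDiv(const std::vector<uint32_t> &X, const std::vector<uint32_t> &D) {
  assert(X.size() == D.size());
  Vec R(X.size());
  for (std::size_t I = 0; I < X.size(); ++I) {
    assert(D[I] != 0 && "udiv by zero is UB");
    R[I] = X[I] / D[I];
  }
  return R;
}

// Models `vp.udiv x, y, allones, evl`: lanes below EVL are divided,
// lanes at or past EVL are left as poison (std::nullopt).
Vec vpUDivAllOnes(const std::vector<uint32_t> &X,
                  const std::vector<uint32_t> &Y, std::size_t EVL) {
  assert(X.size() == Y.size());
  Vec R(X.size());
  for (std::size_t I = 0; I < X.size() && I < EVL; ++I)
    R[I] = X[I] / Y[I];
  return R;
}

// The fold is only observable through lanes that are actually read, so
// compare just the lanes below EVL.
bool agreeOnActiveLanes(const Vec &A, const Vec &B, std::size_t EVL) {
  for (std::size_t I = 0; I < EVL; ++I)
    if (A[I] != B[I])
      return false;
  return true;
}
```

With `x = {8, 9, 10, 11}`, `y = {2, 3, 0, 0}` and an EVL of 2, the unfolded form yields `x` in lanes 2 and 3 (division by the safe divisor 1), while the folded form yields poison there. The two only agree on lanes below EVL, so the fold is sound only if nothing reads the lanes past EVL.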
https://github.com/llvm/llvm-project/pull/148828