[llvm] [LV][EVL] Emit vp.merge intrinsic to enable out-loop reduction in EVL vectorization. (PR #101641)
Mel Chen via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 29 06:16:47 PDT 2024
================
@@ -1392,7 +1395,19 @@ static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
TypeInfo.inferScalarType(Sel),
false, false, false);
})
-
+ .Case<VPInstruction>([&](VPInstruction *VPI) -> VPRecipeBase * {
+ VPValue *LHS, *RHS;
+ if (!match(VPI, m_Select(m_Specific(HeaderMask), m_VPValue(LHS),
----------------
Mel-Chen wrote:
I understand your concern. However, I don’t think we need to perform a check specifically for reductions. As long as the `VPInstruction::select` matches the form `select(HeaderMask, LHS, RHS)`, it is correct to convert it to `vp.merge(all-true, LHS, RHS, EVL)`, whether or not it’s a predicated reduction select.
The difference between `vp.merge` and `vp.select` lies in the result lanes at positions greater than or equal to EVL: `vp.select` leaves those lanes undefined, whereas `vp.merge` sets them to `RHS` (the on-false value). Thus, `VPWidenSelectRecipe` and `VPInstruction::select` **without** a header mask condition can be converted to `vp.select`, because we don't care about the results in inactive lanes, while `VPInstruction::select` **with** a header mask condition must be converted to `vp.merge`.
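To make the lane semantics concrete, here is a minimal scalar model in plain C++. It is only an illustrative sketch, not LLVM code: the names `vpSelect`, `vpMerge`, `sumWithMerge`, the fixed width `VF = 4`, and the `-1` stand-in for an undefined lane are all hypothetical. It models an out-of-loop add reduction where the per-lane accumulator must survive the final iteration with `EVL < VF`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical fixed vector width for this model.
constexpr std::size_t VF = 4;
using Vec = std::array<int, VF>;
using Mask = std::array<bool, VF>;
using MaybeVec = std::array<std::optional<int>, VF>;

// Model of vp.select: lanes at positions >= evl are undefined,
// represented here as std::nullopt.
MaybeVec vpSelect(const Mask &cond, const Vec &onTrue, const Vec &onFalse,
                  std::size_t evl) {
  assert(evl <= VF);
  MaybeVec result{};
  for (std::size_t i = 0; i < evl; ++i)
    result[i] = cond[i] ? onTrue[i] : onFalse[i];
  return result;
}

// Model of vp.merge: lanes at positions >= evl take the on-false value.
Vec vpMerge(const Mask &cond, const Vec &onTrue, const Vec &onFalse,
            std::size_t evl) {
  assert(evl <= VF);
  Vec result{};
  for (std::size_t i = 0; i < VF; ++i)
    result[i] = (i < evl && cond[i]) ? onTrue[i] : onFalse[i];
  return result;
}

// Out-of-loop add reduction: keep per-lane partial sums in the loop and
// reduce them to a scalar only after the loop.
int sumWithMerge(const int *data, std::size_t n) {
  Vec acc{}; // all lanes start at 0
  const Mask allTrue = {true, true, true, true};
  for (std::size_t base = 0; base < n; base += VF) {
    const std::size_t evl = std::min(VF, n - base);
    Vec next{};
    for (std::size_t i = 0; i < VF; ++i)
      // Lanes >= evl of the predicated add are undefined; -1 is an
      // arbitrary stand-in for that garbage.
      next[i] = i < evl ? acc[i] + data[base + i] : -1;
    // vp.merge(all-true, next, acc, evl): inactive lanes keep the old acc.
    acc = vpMerge(allTrue, next, acc, evl);
  }
  int total = 0;
  for (int lane : acc)
    total += lane;
  return total;
}
```

In this model, summing six elements runs a final iteration with `evl == 2`; the merge keeps lanes 2 and 3 of the accumulator from the previous iteration, so the reduction after the loop still sees every partial sum. With the `vpSelect` model, those two lanes would hold no defined value.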
https://github.com/llvm/llvm-project/pull/101641