[llvm] [VPlan] Extract reverse operation for reverse accesses (PR #146525)

Mon Sep 15 03:13:18 PDT 2025

================
@@ -2623,6 +2646,34 @@ static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
       }
     }
     ToErase.push_back(CurRecipe);
+
+    // Convert general reverse operations on loaded result into vp.reverse, when
+    // the VPVectorEndPointerRecipe adjusting the access address uses EVL
+    // instead of VF.
+    if (auto *LoadR = dyn_cast<VPWidenLoadEVLRecipe>(EVLRecipe)) {
----------------
Mel-Chen wrote:

> Is there a reason we handle the load/store cases separately, instead of just converting all reverse operations? 

This is the conclusion @lukel97 and I reached: we should convert only the recipes that are actually masked by the header mask, rather than converting every recipe into an EVL recipe.
For now, converting all reverses in the vectorized loop into vp.reverse has the same effect as converting along the DU/UD chain from VPWidenLoadEVLRecipe/VPWidenStoreEVLRecipe, because reverse accesses are currently the only place reverses are used.

> Could we mis-compile in the future if some other transform decides to create new reverse operations?

The conversion is based on the fact that masked reverse accesses use `reverse(header mask)` or `reverse(header mask and mask)` as the mask. A reversed header mask moves the inactive lanes to the head, while VP intrinsics with EVL can only mask out inactive lanes at the tail, not at the head. That’s why we need a series of transformations: `VPVectorEndPointer(ptr, VF)` becomes `VPVectorEndPointer(ptr, EVL)`, and reverse becomes vp.reverse.
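
To make the lane layout concrete, here is a minimal standalone sketch of a reverse load done both ways. It is only a model, not LLVM code: `Lane`, `headerMask`, `maskedLoad`, `reverseLoadWithMask` and `reverseLoadWithEVL` are hypothetical names, and an empty `std::optional` stands in for an inactive (poison) lane.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using Lane = std::optional<int>; // empty optional models an inactive lane
using Vec = std::vector<Lane>;

// Header mask: the first `evl` of `vf` lanes are active.
std::vector<bool> headerMask(std::size_t vf, std::size_t evl) {
  assert(evl <= vf && "EVL never exceeds VF");
  std::vector<bool> mask(vf, false);
  std::fill(mask.begin(), mask.begin() + evl, true);
  return mask;
}

// Masked wide load starting at `base`; inactive lanes never touch memory.
Vec maskedLoad(const std::vector<int> &mem, std::ptrdiff_t base,
               const std::vector<bool> &mask) {
  Vec out(mask.size());
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      out[i] = mem.at(static_cast<std::size_t>(
          base + static_cast<std::ptrdiff_t>(i)));
  return out;
}

// Mask-based form: end pointer computed from VF, load masked by
// reverse(header mask), then a plain full-width reverse.
Vec reverseLoadWithMask(const std::vector<int> &mem, std::ptrdiff_t last,
                        std::size_t vf, std::size_t evl) {
  std::vector<bool> hm = headerMask(vf, evl);
  std::vector<bool> rev(hm.rbegin(), hm.rend()); // inactive lanes at the head
  std::ptrdiff_t base = last - (static_cast<std::ptrdiff_t>(vf) - 1);
  Vec wide = maskedLoad(mem, base, rev);
  return Vec(wide.rbegin(), wide.rend());
}

// EVL-based form: end pointer computed from EVL, only tail lanes disabled,
// then a vp.reverse-like step that reverses just the first `evl` lanes.
Vec reverseLoadWithEVL(const std::vector<int> &mem, std::ptrdiff_t last,
                       std::size_t vf, std::size_t evl) {
  std::ptrdiff_t base = last - (static_cast<std::ptrdiff_t>(evl) - 1);
  Vec wide = maskedLoad(mem, base, headerMask(vf, evl));
  std::reverse(wide.begin(), wide.begin() + evl);
  return wide;
}
```

With `mem = {10, 20, 30}`, `last = 2`, `vf = 4` and `evl = 3`, the mask-based form loads from index -1 and relies on the head lane being masked off, while the EVL-based form loads from index 0 and leaves the inactive lane at the tail. Both produce the same active lanes in the same order, which is why the end pointer and the reverse have to be converted together.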

If a new recipe is also masked by `reverse(header mask)` or `reverse(header mask and mask)`, then converting the new recipe into an EVL recipe also requires converting the reverse into vp.reverse. But if it isn’t masked, there’s no need to convert it into vp.reverse.


https://github.com/llvm/llvm-project/pull/146525