[llvm] [VPlan] Get Addr computation cost with scalar type if it is uniform for gather/scatter. (PR #150371)

Elvis Wang via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 11 20:56:28 PDT 2025


https://github.com/ElvisWang123 updated https://github.com/llvm/llvm-project/pull/150371

>From ee2b59094c5f0fb9ee6f8620cb6a2f0e0c33f2c6 Mon Sep 17 00:00:00 2001
From: Elvis Wang <elvis.wang at sifive.com>
Date: Wed, 23 Jul 2025 17:02:40 -0700
Subject: [PATCH] [VPlan] Get Addr computation cost with scalar type if it is
 uniform for gather/scatter.

This patch query `getAddressComputationCost()` with scalar type if the
address is uniform. This can help the cost for gather/scatter more
accurate.

In current LV, non consecutive VPWidenMemoryRecipe (gather/scatter) will
account the cost of address computation. But there are some cases that
the addr is uniform accross lanes, that makes the address can be
calculated with scalar type and broadcast.

I have a follow optimization that try to converts gather/scatter with
uniform memory acces to scalar load/store + broadcast. With this
optimization, we can remove this temporary change.
---
 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index e34cab117f321..0cf7a425fa079 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -3106,10 +3106,17 @@ InstructionCost VPWidenMemoryRecipe::computeCost(ElementCount VF,
     // TODO: Using the original IR may not be accurate.
     // Currently, ARM will use the underlying IR to calculate gather/scatter
     // instruction cost.
-    const Value *Ptr = getLoadStorePointerOperand(&Ingredient);
-    Type *PtrTy = toVectorTy(Ptr->getType(), VF);
     assert(!Reverse &&
            "Inconsecutive memory access should not have the order.");
+
+    const Value *Ptr = getLoadStorePointerOperand(&Ingredient);
+    Type *PtrTy = Ptr->getType();
+
+    // If the address value is uniform across all lane, then the address can be
+    // calculated with scalar type and broacast.
+    if (!vputils::isSingleScalar(getAddr()))
+      PtrTy = toVectorTy(PtrTy, VF);
+
     return Ctx.TTI.getAddressComputationCost(PtrTy) +
            Ctx.TTI.getGatherScatterOpCost(Opcode, Ty, Ptr, IsMasked, Alignment,
                                           Ctx.CostKind, &Ingredient);



More information about the llvm-commits mailing list