[llvm] [AArch64][CostModel] Improve cost estimate of scalarizing a vector di… (PR #118055)

Alexey Bataev via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 19 09:26:03 PST 2024


================
@@ -3472,6 +3472,20 @@ InstructionCost AArch64TTIImpl::getArithmeticInstrCost(
           Cost *= 4;
         return Cost;
       } else {
+        // If the information about individual scalars being vectorized is
+        // available, this yields better cost estimation.
+        if (auto *VTy = dyn_cast<FixedVectorType>(Ty); VTy && !Args.empty()) {
+          InstructionCost InsertExtractCost =
+              ST->getVectorInsertExtractBaseCost();
+          Cost = (3 * InsertExtractCost) * VTy->getNumElements();
+          for (int i = 0, Sz = Args.size(); i < Sz; i += 2) {
+            Cost += getArithmeticInstrCost(
+                Opcode, VTy->getScalarType(), CostKind,
+                TTI::getOperandInfo(Args[i]), TTI::getOperandInfo(Args[i + 1]));
+          }
+          return Cost;
+        }
----------------
alexey-bataev wrote:

> > 1. That's fine. If you think they should be vectorized, a new vectorization strategy should be implemented instead.
> 
> Don't you think lowering the division vector in some late IR pass is easier than stopping vectorization at div? One more advantage of lowering later is that the lowering is transparent to the user and done in a single place rather than in multiple places. This type of vector lowering is already done elsewhere, e.g. for masked_load.
> 

It just breaks the vectorizer design. If the node should be scalarized, the vectorization model should know about it and scalarize it. Otherwise, it complicates the whole vectorization process and causes it to miss some possible vectorization opportunities. I assume you missed the cost of extracting the operands in your cost model changes?
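
To make the cost question concrete, the formula in the hunk above can be reproduced as standalone arithmetic. This is only a sketch: `scalarizedCost` is a hypothetical helper, not an LLVM API, and the numbers used with it are made up rather than taken from any AArch64 subtarget.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the formula in the patch; not an LLVM API.
// insertExtractBase: assumed per-lane insert/extract base cost.
// scalarOpCosts: assumed cost of each scalar operation, one entry per
//                (Args[i], Args[i + 1]) operand pair visited by the loop.
int scalarizedCost(int numElements, int insertExtractBase,
                   const std::vector<int> &scalarOpCosts) {
  assert(numElements > 0 && insertExtractBase >= 0);
  // Mirrors `Cost = (3 * InsertExtractCost) * VTy->getNumElements();`
  int cost = 3 * insertExtractBase * numElements;
  // Mirrors the loop that adds one scalar operation cost per operand pair.
  for (int c : scalarOpCosts)
    cost += c;
  return cost;
}
```

For example, with 4 lanes, an assumed base cost of 2, and an assumed scalar division cost of 1 per operand pair, `scalarizedCost(4, 2, {1, 1, 1, 1})` evaluates to 3 * 2 * 4 + 4 = 28. Whether the factor of 3 fully covers extracting both operands and inserting the result is the point under discussion here.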

> > 2. No, it is just a simple legality check. If the instruction should be scalarized, it should be built into a build vector node; it should not be vectorized.
> 
> Just need a clarification: my understanding of `TreeEntry::NeedToGather` is that you build a vector out of scalars. Is that correct? Now, coming to the legality checks: yes, in the end it's a legality check, but I would need to check all the constraints, which are currently in `getArithmeticInstrCost`, once again. And I am trying to avoid that route for the reasons mentioned in my comment on (1).

You can check these constraints beforehand and instruct the vectorizer to build a NeedToGather node. You should not use getArithmeticInstrCost in this case; some other function should check whether it is legal to vectorize these scalars here.
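
The separation suggested here, with legality decided first and cost queried only afterwards, could look roughly like the following standalone model. All names (`NodeKind`, `Bundle`, `classifyBundle`, `needsScalarization`) are hypothetical and do not correspond to the SLP vectorizer's actual data structures; the legality flag stands in for whatever target constraints would apply.

```cpp
#include <cassert>

// Hypothetical model of the decision; none of these names are LLVM APIs.
enum class NodeKind { Vectorize, Gather };

struct Bundle {
  // Assumed result of a dedicated legality query, computed without
  // consulting the cost model.
  bool needsScalarization;
};

// Legality is decided up front: a bundle that must be scalarized becomes a
// gather (build vector) node and is never treated as a vector operation.
NodeKind classifyBundle(const Bundle &b) {
  return b.needsScalarization ? NodeKind::Gather : NodeKind::Vectorize;
}

// The cost query then only ever sees nodes whose kind is already fixed.
// vectorCost and gatherCost are assumed inputs supplied by the caller.
int nodeCost(const Bundle &b, int vectorCost, int gatherCost) {
  return classifyBundle(b) == NodeKind::Gather ? gatherCost : vectorCost;
}
```

In this model the cost function never has to rediscover that a node will be scalarized, which is the design property being argued for.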


https://github.com/llvm/llvm-project/pull/118055

