[PATCH] D135991: [AArch64] Fix cost model for `udiv` instruction when one of the operands is a uniform constant

Zain Jaffal via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 31 02:42:47 PDT 2022


zjaffal marked 5 inline comments as done.
zjaffal added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2109
+            (Op2Info.isConstant() && Op2Info.isUniform())) {
+          InstructionCost InsertCost = getArithmeticInstrCost(
+              Instruction::InsertElement, Ty, CostKind, Op1Info, Op2Info);
----------------
dmgreen wrote:
> getArithmeticInstrCost should not be used with InsertElement or ExtractElement. It is used for arithmetic instructions like Add and Div. getVectorInstrCost should be used with InsertElement and ExtractElement.
> 
> For the i64 Mul case below we hard-coded the values 2 instead, due to the more regular nature of the scalarization. (It was originally 1, but we had to increase it as it was not quite high enough in cases).
The code before my patch used `getArithmeticInstrCost` should we only use `getVectorInstrCost` then?


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2116
+                            << " Extract Cost: " << ExtractCost << "\n");
+          return InsertCost + ExtractCost + VTy->getNumElements();
+        }
----------------
dmgreen wrote:
> This is assuming that the cost of the actual divide is 1? That sounds too low in some cases, compared to scalar. Should it be `(InsertCost + ExtractCost + Cost) * VTy->getNumElements()`? 
The problem is that cost here counts the insert and extract cost so we will end up counting twice. 
```
    InstructionCost Cost = BaseT::getArithmeticInstrCost(
        Opcode, Ty, CostKind, Op1Info, Op2Info);
```
Takes into account the scalarization overhead. For the following example

```
define <8 x i32> @foo(<8 x i32> %v) {
  %res = udiv <8 x i32> %v, <i32 255, i32 255, i32 255, i32 255, i32 255, i32 255, i32 255, i32 255>
  ret <8 x i32> %res
}
```
The cost was 44. I don't think it is correct


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135991/new/

https://reviews.llvm.org/D135991



More information about the llvm-commits mailing list