[PATCH] D105432: [Analysis] Add simple cost model for strict (in-order) reductions

Tue Jul 6 03:29:22 PDT 2021

dmgreen added a comment.

Can we add cost model tests too? There is currently a fixme for `// FIXME: Add new flag for cost of strict reductions.`

Is it better to add a new method, or to add a flag to the existing methods and base the decision on the fast math flags? Any fp reduction, not just from the vectorizer, may be strict-inorder if the fmf's on it do not allow reassoc. Similarly for fmin/fmax where nnan/ninf/reassoc may make a difference.

================
Comment at: llvm/include/llvm/CodeGen/BasicTTIImpl.h:2087
+                                   TTI::TargetCostKind CostKind) {
+    auto *VTy = cast<FixedVectorType>(Ty);
+    InstructionCost Cost =
----------------
Should this be checking for scalable vectors?

================
Comment at: llvm/include/llvm/CodeGen/BasicTTIImpl.h:2089
+    InstructionCost Cost =
+        getArithmeticInstrCost(Opcode, VTy->getElementType(), CostKind);
+    return Cost * VTy->getNumElements();
----------------
I think this should this also include the cost of extracts (which will sometimes be free, but that is target specific). Probably using the ScalarizationOverhead.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h:136-137
+  /// when scalarizing an operation for a vector with ElementCount \p VF.
+  /// For scalable vectors this currently takes the most pessimistic view based
+  /// upon the maximum possible value for vscale.
+  unsigned getScalarizationCostFactor(ElementCount VF) const {
----------------
I had assumed (without thinking about it very much) that the costs for VF arguments would be based either the exact value of VF from the -mcpu argument if one is provided. If it is not then we would guess at a value, probably VF=2 would be a sensible default. This is using the maximum possible VF, which sounds like a large over-estimate in most cases.

Can you speak to why the max VF is a better value to use?

I'm not sure I understand why this is scalarizing though.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105432/new/

https://reviews.llvm.org/D105432