[PATCH] D155804: [LV] Cache call vectorization decisions

Mon Jul 31 05:20:41 PDT 2023

david-arm added a comment.

Hi @huntergr this looks like a good clean up! I've not reviewed `setVectorizedCallDecision`, but I do have a few comments so far ...

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1250

+  void setVectorizedCallDecision(ElementCount VF);
+
----------------
Can you add comments here explaining what the function does?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1433
+                               InstructionCost Cost) {
+    assert(VF.isVector() && "Expected VF >= 2");
+    CallWideningDecisions[std::make_pair(CI, VF)] = {Kind, Variant, IID,
----------------
nit: This assert comment isn't quite right because `isVector` returns true for VF=vscale x 1. Same for the assert below.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3483
+  // we should already have a pre-calculated cost at each VF.
+  if (VF.isScalar()) {
+    TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
----------------
To avoid indenting so much code can you rewrite this as:

  if (VF.isVector())
    return CallWideningDecisions.at(std::make_pair(CI, VF)).Cost;

  ... rest of unindented scalar code ...

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6669
   // the reduction on its own.
-  Instruction *LastChain = InLoopReductionImmediateChains[RetI];
+  Instruction *LastChain = InLoopReductionImmediateChains.at(RetI);
   Instruction *ReductionPhi = LastChain;
----------------
Why are these changes needed?

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll:9

-; CHECK-VF4: Found an estimated cost of 1 for VF 4 For instruction:   %add = fadd float %0, %sum.07
-; CHECK-VF8: Found an estimated cost of 2 for VF 8 For instruction:   %add = fadd float %0, %sum.07
+; CHECK-VF4: Found an estimated cost of 17 for VF 4 For instruction:   %add = fadd float %0, %sum.07
+; CHECK-VF8: Found an estimated cost of 34 for VF 8 For instruction:   %add = fadd float %0, %sum.07
----------------
Can this cost model fix be done in a separate patch so that it's not tied to the vector call changes? It also means this patch can become more like a NFC patch.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155804/new/

https://reviews.llvm.org/D155804