[PATCH] D156112: [AArch64][LoopVectorize] Improve tail-folding heuristic on neoverse-v1

Igor Kirillov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 24 05:28:28 PDT 2023


igor.kirillov created this revision.
Herald added subscribers: artagnon, mgabka, shiva0217, hiraditya, kristof.beyls.
Herald added a project: All.
igor.kirillov requested review of this revision.
Herald added subscribers: llvm-commits, wangpc.
Herald added a project: LLVM.

Raises the instruction-count threshold for predicated tail-folding when
the loop contains more than one comparison. On Neoverse V1 the "whileXX"
and vector comparison instructions have a throughput of only one per
cycle, so if there is not enough computation between the comparisons,
the code can slow down.
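
As a rough illustration of the scaled check described above, here is a
stand-alone sketch. It is not the code in AArch64TargetTransformInfo.cpp:
the function name prefersTailFolding and the threshold value 15 are
assumptions made for this example, with the Threshold parameter standing in
for SVETailFoldInsnThreshold.

```cpp
// Hypothetical model of the heuristic, kept separate from LLVM so it can be
// compiled on its own. Threshold stands in for SVETailFoldInsnThreshold.
constexpr bool prefersTailFolding(unsigned NumInsns, unsigned NumComparisons,
                                  unsigned Threshold) {
  // The required instruction count grows linearly with the number of
  // comparisons found in the loop body.
  return NumInsns >= Threshold * NumComparisons;
}

// Illustrative threshold only; the real option value is not taken from the
// patch text.
constexpr unsigned ExampleThreshold = 15;

// One comparison (the IV compare): the check is the same as before the patch.
static_assert(prefersTailFolding(15, 1, ExampleThreshold), "15 >= 15 * 1");

// Two comparisons: a 20-instruction loop no longer qualifies,
// while a 30-instruction loop does.
static_assert(!prefersTailFolding(20, 2, ExampleThreshold), "20 < 15 * 2");
static_assert(prefersTailFolding(30, 2, ExampleThreshold), "30 >= 15 * 2");

// Zero comparisons: the product is zero, so any loop passes the check.
static_assert(prefersTailFolding(1, 0, ExampleThreshold), "1 >= 15 * 0");
```

With a single comparison the result matches the previous behaviour, so only
loops with extra comparisons need more instructions to qualify.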


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D156112

Files:
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp


Index: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -3774,12 +3774,19 @@
   // Don't tail-fold for tight loops where we would be better off interleaving
   // with an unpredicated loop.
   unsigned NumInsns = 0;
+  unsigned NumComparisons = 0;
   for (BasicBlock *BB : TFI->LVL->getLoop()->blocks()) {
     NumInsns += BB->sizeWithoutDebug();
+    NumComparisons += count_if(
+        *BB, [](Instruction &I) { return isa<CmpInst>(&I); });
   }
 
   // We expect 4 of these to be a IV PHI, IV add, IV compare and branch.
-  return NumInsns >= SVETailFoldInsnThreshold;
+  // If there is more than one comparison in the loop, increase the required
+  // number of instructions for predicated tail folding. This is because the
+  // throughput of comparison and `whileXX` instructions is only one per cycle,
+  // and insufficient computation between comparisons can slow down the code.
+  return NumInsns >= SVETailFoldInsnThreshold * NumComparisons;
 }
 
 InstructionCost

