[llvm] [AArch64] Enable RT and partial unrolling with reductions for Apple CPUs. (PR #149699)

Eli Friedman via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 21 10:04:24 PDT 2025


================
@@ -4787,6 +4788,22 @@ getAppleRuntimeUnrollPreferences(Loop *L, ScalarEvolution &SE,
   if (!L->getExitBlock())
     return;
 
+  // Check if the loop contains any reductions that could be parallelized when
+  // unrolling. If so, enable partial unrolling, if the trip count is known to be
+  // a multiple of 2.
+  bool HasParellelizableReductions =
+      L->getNumBlocks() == 1 &&
+      any_of(L->getHeader()->phis(),
+             [&SE, L](PHINode &Phi) {
+               return canParallelizeReductionWhenUnrolling(Phi, L, &SE);
+             }) &&
+      isLoopSizeWithinBudget(L, TTI, 12, nullptr);
----------------
efriedma-quic wrote:

Can we somehow get the unroller itself to compute whether it will actually generate a parallelized reduction, and pass that down to getUnrollingPreferences, instead of trying to recompute it in target-specific code?
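
For context, here is a minimal source-level sketch of what "parallelizing a reduction when unrolling" means. It is not code from the patch or from the LLVM unroller; the function names `sumSerial` and `sumUnrolledBy2` are hypothetical and chosen only for illustration. The second function assumes the element count is a multiple of 2, mirroring the trip-count condition mentioned in the diff comment.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Baseline: a single accumulator. Every add depends on the result of the
// previous iteration, so the loop is bound by the latency of one add chain.
std::uint64_t sumSerial(const std::vector<std::uint32_t> &V) {
  std::uint64_t Sum = 0;
  for (std::size_t I = 0; I < V.size(); ++I)
    Sum += V[I];
  return Sum;
}

// Unrolled by 2 with two independent accumulators (illustrative only).
// The two add chains do not depend on each other, so an out-of-order core
// can execute them in parallel. Assumes V.size() is a multiple of 2; a
// trailing odd element would be skipped, so there is no remainder loop.
std::uint64_t sumUnrolledBy2(const std::vector<std::uint32_t> &V) {
  std::uint64_t Sum0 = 0, Sum1 = 0;
  for (std::size_t I = 0; I + 1 < V.size(); I += 2) {
    Sum0 += V[I];     // even-indexed elements
    Sum1 += V[I + 1]; // odd-indexed elements
  }
  // Combine the partial sums once, after the loop.
  return Sum0 + Sum1;
}
```

Because unsigned integer addition is associative, both functions return the same value for even-length inputs. For floating-point reductions the reordering can change the result, so it is only valid when reassociation is permitted.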

https://github.com/llvm/llvm-project/pull/149699

