[llvm] [AArch64] Enable RT and partial unrolling with reductions for Apple CPUs. (PR #149699)
Florian Hahn via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 22 05:26:17 PDT 2025
================
@@ -4787,6 +4788,22 @@ getAppleRuntimeUnrollPreferences(Loop *L, ScalarEvolution &SE,
if (!L->getExitBlock())
return;
+ // Check if the loop contains any reductions that could be parallelized when
+ // unrolling. If so, enable partial unrolling, if the trip count is known to be
+ // a multiple of 2.
+ bool HasParellelizableReductions =
+ L->getNumBlocks() == 1 &&
+ any_of(L->getHeader()->phis(),
+ [&SE, L](PHINode &Phi) {
+ return canParallelizeReductionWhenUnrolling(Phi, L, &SE);
+ }) &&
+ isLoopSizeWithinBudget(L, TTI, 12, nullptr);
----------------
fhahn wrote:
I had a look, but couldn't really find a good way.
We could add parallelizable reductions to the unrolling preferences, and detect them before handing off to TTI. But then we would always need to perform the reduction detection, which would come with some compile-time cost.
Perhaps there is another alternative?
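
For readers without the context of the patch, here is a minimal source-level sketch of what "parallelizing" a reduction when unrolling means. It is an illustration only, not code from the PR: the function names `sumSerial` and `sumUnrolledBy2` are hypothetical, and the real transform operates on LLVM IR rather than on C++ source. The sketch assumes an even trip count, mirroring the "multiple of 2" condition in the quoted comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Baseline: a single accumulator, so each add depends on the previous one.
std::uint64_t sumSerial(const std::vector<std::uint32_t> &v) {
  std::uint64_t acc = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    acc += v[i];
  return acc;
}

// Unrolled by 2, with the reduction split over two independent accumulators.
// The two dependency chains can execute in parallel on a wide core.
// Assumption: the trip count is even, so no remainder loop is needed.
std::uint64_t sumUnrolledBy2(const std::vector<std::uint32_t> &v) {
  assert(v.size() % 2 == 0 && "trip count must be a multiple of 2");
  std::uint64_t acc0 = 0, acc1 = 0;
  for (std::size_t i = 0; i < v.size(); i += 2) {
    acc0 += v[i];
    acc1 += v[i + 1];
  }
  // Combine the partial results once, after the loop.
  return acc0 + acc1;
}
```

Both functions return the same value for integer addition because it is associative and commutative; a floating-point reduction would only be eligible for this kind of splitting if reassociation is permitted.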
https://github.com/llvm/llvm-project/pull/149699