[llvm] [LoopUnroll] Structural cost savings analysis for full loop unrolling (PR #114579)
Lucas Ramirez via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 19 08:02:31 PST 2024
lucas-rami wrote:
Thanks a lot for adding context here. I see the danger of catastrophic unrolling, and I now see that my patch would make it worse. My original motivation for this patch was in fact an AMDGPU kernel, structurally similar to my motivating example, in which all loops need to be fully unrolled to achieve good performance (see also the Discourse post [here](https://discourse.llvm.org/t/unexpected-peeling-decision-when-trying-to-eliminate-compares-inside-loop/82866/3)).
I think one solution would be, when the loop contains subloops, to severely limit (or even nullify) the cost savings my analysis estimates whenever those subloops are too big or have large trip counts (what "too big" means would still have to be defined somehow). If a subloop's trip count cannot be determined, we would have to err on the "do not unroll" side. Do you think this would prevent catastrophic unrolling while still leaving the analysis somewhat useful as an optimization?
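To make the proposal concrete, here is a minimal, standalone sketch of the guard I have in mind. It is not code from the patch and does not use LLVM's actual types: `SubloopInfo`, `limitSavings`, and both thresholds are hypothetical names and placeholder values, and the sketch takes the "nullify" option rather than scaling the savings down.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical summary of one subloop; this is not an LLVM type.
struct SubloopInfo {
  uint64_t Size;                     // estimated size of the subloop body
  std::optional<uint64_t> TripCount; // std::nullopt if not known at compile time
};

// Placeholder thresholds; what "too big" means is still to be defined.
constexpr uint64_t MaxSubloopSize = 64;
constexpr uint64_t MaxSubloopTripCount = 16;

// Returns the savings the unroller may take into account. Any subloop with an
// unknown trip count, a large trip count, or a large body nullifies the
// estimate, so the decision errs on the "do not unroll" side.
uint64_t limitSavings(uint64_t EstimatedSavings,
                      const std::vector<SubloopInfo> &Subloops) {
  for (const SubloopInfo &SL : Subloops) {
    if (!SL.TripCount)
      return 0; // trip count cannot be determined
    if (SL.Size > MaxSubloopSize || *SL.TripCount > MaxSubloopTripCount)
      return 0; // subloop is too big or iterates too many times
  }
  return EstimatedSavings; // no subloops, or all of them are small
}
```

A softer variant could scale the savings by some function of subloop size and trip count instead of returning zero; the all-or-nothing version is just the simplest one to reason about.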
https://github.com/llvm/llvm-project/pull/114579