[llvm] [LoopUnroll] Penalize interior control flow (PR #67137)

Mon Sep 25 08:16:16 PDT 2023

nikic wrote:

> Do you have any data on how widespread the impact of this is?

Here is the code size impact on CTMark: http://llvm-compile-time-tracker.com/compare.php?from=1c9b63f1038d13be7d83015565e2ecc395964c04&to=46f453b2872f5c31de0a9f6c4e35cbe52765a608&stat=instructions:u

And here are the corresponding IR diffs: https://github.com/nikic/llvm-ir-diffs/commit/badc90e203398680143b0ddf6e97a2571183dcf1

> Penalizing loops with internal control flow makes sense to me in general, unless unrolling allows the branches to be folded. Should this be integrated with `UnrolledInstAnalyzer`? I haven't looked at that in a while though so I am not sure if it actually can correctly estimate such things.

Yeah, I think the "unless unrolling allows the branches to be folded" part is the key bit. This is not something we capture right now. The problem here is that we have two modes of analyzing unrolling. analyzeLoopUnrollCost(), which is used for full unrolling up to 10 iterations, is very precise but also very expensive (#Insts * #Iterations). ApproximateLoopSize(), which is used for everything else, is completely crude and does not reason about simplifications at all.

I'm considering making ApproximateLoopSize() smarter first, by determining which instructions we expect to constant fold after full unrolling without any per-iteration analysis (basically, if we have a simple recurrence with a constant start and a constant step, then any side-effect-free instruction depending only on it will constant fold). This would make estimates for full unrolling more accurate and would allow us to not apply the branch penalty to branches that will fold away. I'm somewhat afraid this will backfire though, because it will increase the amount of unrolling if thresholds remain the same (and, generally speaking, we are at a point where *average* unrolling needs to decrease rather than increase for better performance).
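To make the idea concrete, here is a rough sketch in Python over a toy IR. It is not LLVM code and does not reflect how ApproximateLoopSize() is implemented. The `Inst` type, the `pure` flag, the `branch_penalty` value and both function names are hypothetical. One assumption beyond the text above: loads are treated as non-foldable even though they have no side effects, since their result depends on memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Toy instruction (hypothetical, not an LLVM type)."""
    name: str
    opcode: str
    operands: tuple      # names of other values, or int literals
    pure: bool = True    # assumption: False for loads, stores and branches


def foldable_after_full_unroll(body, induction_vars):
    """Names of instructions expected to constant fold after full unrolling.

    A single pass over the body, with no per-iteration analysis: a pure
    instruction folds if every operand is a literal, an induction variable
    with constant start and step, or an instruction already known to fold.
    Assumes definitions precede uses.
    """
    known_constant = set(induction_vars)
    for inst in body:
        if not inst.pure:
            continue
        if all(isinstance(op, int) or op in known_constant
               for op in inst.operands):
            known_constant.add(inst.name)
    return known_constant - set(induction_vars)


def estimate_unrolled_size(body, induction_vars, trip_count, branch_penalty=2):
    """Crude size estimate: folded instructions are free, and only branches
    whose condition does not fold pay the interior control flow penalty."""
    folds = foldable_after_full_unroll(body, induction_vars)
    per_iteration = 0
    for inst in body:
        if inst.opcode == "br":
            cond = inst.operands[0]
            if isinstance(cond, int) or cond in folds:
                continue  # branch folds away: no cost, no penalty
            per_iteration += 1 + branch_penalty
        elif inst.name not in folds:
            per_iteration += 1
    return per_iteration * trip_count


# Loop body with induction variable "i" (constant start and step assumed).
body = [
    Inst("idx", "mul", ("i", 4)),
    Inst("is_first", "icmp", ("i", 0)),
    Inst("br0", "br", ("is_first",), pure=False),  # condition folds
    Inst("val", "load", ("idx",), pure=False),
    Inst("big", "icmp", ("val", 100)),
    Inst("br1", "br", ("big",), pure=False),       # condition does not fold
    Inst("st", "store", ("val", "idx"), pure=False),
]

folds = foldable_after_full_unroll(body, {"i"})
size = estimate_unrolled_size(body, {"i"}, trip_count=4)
```

In this toy example only `idx` and `is_first` are expected to fold, so `br0` is dropped without penalty while `br1` still pays it. The estimate is smaller than one that penalizes both branches, which is exactly the effect that could increase unrolling if thresholds stay unchanged.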

https://github.com/llvm/llvm-project/pull/67137