[llvm-dev] Enabling Loop Distribution Pass as default in the pipeline of new pass manager
Sanne Wouda via llvm-dev
llvm-dev at lists.llvm.org
Fri Jun 25 03:23:11 PDT 2021
Do you have any data on how often LoopDistribute triggers on a larger set of programs (like llvm-test-suite + SPEC)? AFAIK the implementation is very limited at the moment (geared towards catching the case in hmmer) and I suspect lack of generality is one of the reasons why it is not enabled by default yet.
It would be good to have some fresh numbers on how often LoopDistribute triggers. From what I remember, there are a handful of cases in the test suite, but nothing that significantly affects performance (other than hmmer, obviously).
Also, there’s been an effort to improve the cost-modeling for LoopDistribute (https://reviews.llvm.org/D100381) Should we make progress in that direction first, before enabling by default?
Unfortunately, there were some problems with this effort. First, the current implementation of LoopDistribute relies heavily on LoopAccessAnalysis, which made it difficult to adapt.
More importantly though, I'm not convinced that LoopDistribute will be beneficial other than in cases where it enables more vectorization. (The memcpy detection gcc might be interesting, I didn't look at that.) It reduces both ILP and MLP, which in some cases might be made up by lower register or cache pressure, but this is hard or impossible for the compiler to know.
While working on this, with a more aggressive LoopDistribute across several benchmarks, I did not see any improvements that didn't turn out to be noise, and plenty of cases where it was actively degrading performance.
Therefore, I'm not sure this direction is worth pursuing further, and I believe the current heuristic of "distribute when it enables new vectorization" is actually pretty reasonable, if not very general.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-dev