[llvm] [LoopUnroll] Add CSE to remove redundant loads after unrolling. (PR #83860)

Florian Hahn via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 24 06:51:06 PDT 2024


https://github.com/fhahn commented:

> The general approach here seems okay to me. The late unroll pass is awkward because it's at the very end of the pipeline, and it's not really feasible to do much optimization after it. The targeted approach here looks fine in that sense, and doesn't seem to add that much additional code complexity. It's not something that will scale though -- there are more and more optimizations we could in theory perform after late unrolling, but we can't reasonably keep replicating them in the post-unroll simplification.
> 

Agreed. If more cleanups turn out to be beneficial, we will likely need to go back to the drawing board.

> Are you able to share what non-benchmark workload was the original motivation for this?

Unfortunately the code is proprietary, so I can't share it, but it is an implementation of Deeplabv3 (https://arxiv.org/abs/1706.05587) tuned for AArch64 and used to blur video backgrounds. The regression comes from loops that combine vector intrinsics with unroll pragmas.

https://github.com/llvm/llvm-project/pull/83860

