[llvm] [LLVM][AMDGPU] AMDGPUInstCombineIntrinsic for *lane intrinsics (PR #99878)

Mon Jul 22 08:01:40 PDT 2024

nikic wrote:

> > Compile-time: http://llvm-compile-time-tracker.com/compare.php?from=a7fb25dd1fcc2e5afcc65cccfa83b7b381b48906&to=18cda5b39d709a861cea69e690d674c4b49789b4&stat=instructions:u
> 
> I wonder if the slowdown is due to running CycleAnalysis, and whether we could somehow avoid that on targets that don't care about divergence.

Very likely. And yes, you should be able to easily avoid that in the new pass manager by delaying the analysis fetch until after the hasBranchDivergence() check.
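
As a rough illustration of that ordering, here is a minimal self-contained sketch. It does not use the real LLVM classes: `MockTargetInfo`, `MockCycleInfo`, `MockAnalysisManager`, and `runPass` are hypothetical stand-ins, and only the `hasBranchDivergence()` query name is taken from the discussion above. The mock manager computes its result lazily on first request, which is the behavior the suggestion relies on.

```cpp
#include <cassert>

// Hypothetical stand-in for the target query object; only the method name
// hasBranchDivergence() comes from the discussion.
struct MockTargetInfo {
  bool Divergent;
  bool hasBranchDivergence() const { return Divergent; }
};

// Hypothetical stand-in for an expensive analysis result (e.g. cycle info).
struct MockCycleInfo {
  int NumCycles = 0;
};

// Hypothetical stand-in for an analysis manager: the result is computed on
// the first request and cached afterwards.
class MockAnalysisManager {
  MockCycleInfo Cached;
  bool Computed = false;

public:
  int ComputeCount = 0; // how many times the expensive analysis actually ran

  const MockCycleInfo &getCycleInfo() {
    if (!Computed) {
      ++ComputeCount; // the expensive work would happen here
      Computed = true;
    }
    return Cached;
  }
};

// Returns true if the divergence-dependent path ran.
bool runPass(const MockTargetInfo &TTI, MockAnalysisManager &AM) {
  // Cheap target check first: targets without divergence bail out before
  // any analysis is requested, so nothing is computed for them.
  if (!TTI.hasBranchDivergence())
    return false;

  // Only now fetch the analysis, which triggers the computation if needed.
  const MockCycleInfo &CI = AM.getCycleInfo();
  assert(CI.NumCycles >= 0);
  return true;
}
```

With this ordering, a target whose `hasBranchDivergence()` returns false never pays for the analysis, while a divergent target computes it once and reuses the cached result on later requests.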

https://github.com/llvm/llvm-project/pull/99878