[llvm] [AMDGPU] Introduce "amdgpu-uniform-intrinsic-combine" pass to combine uniform AMDGPU lane Intrinsics. (PR #116953)

Pankaj Dwivedi via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 11 02:50:53 PDT 2025


PankajDwivedi-25 wrote:

> What I would suggest to solve this is to systematically split the pass into two halves so that all UniformityInfo queries happen before all changes to the IR. The first half iterates over intrinsic instructions, checking UniformityInfo and building a list of all the instructions that can be transformed, and the second half simply goes over the list and applies the changes.


I don't see any benefit in splitting this pass into two halves as suggested, since the two halves combined would still do the same job as the current pass. I think it would add memory overhead to store all the uniform values, which are already stored in UI, and iterating over that list would cost extra compile time.
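
For reference, here is a minimal sketch of the two-phase structure described in the quoted suggestion. It uses toy stand-in types rather than LLVM's actual classes: `Inst`, `ToyUniformity`, `collectCandidates`, and `applyCandidates` are all hypothetical names chosen for illustration and are not part of the pass.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Toy instruction record (hypothetical; not an LLVM type).
struct Inst {
  std::string Op; // e.g. "readfirstlane" or "add"
  int Operand;    // id of the value this instruction reads
  bool Folded;    // set once the instruction has been rewritten
};

// Toy stand-in for a uniformity analysis result (hypothetical):
// the set of value ids known to be uniform.
struct ToyUniformity {
  std::unordered_set<int> UniformValues;
  bool isUniform(int V) const { return UniformValues.count(V) != 0; }
};

// Phase 1: only query the analysis; do not modify any instruction.
std::vector<std::size_t> collectCandidates(const std::vector<Inst> &Insts,
                                           const ToyUniformity &UI) {
  std::vector<std::size_t> Worklist;
  for (std::size_t I = 0; I < Insts.size(); ++I)
    if (Insts[I].Op == "readfirstlane" && UI.isUniform(Insts[I].Operand))
      Worklist.push_back(I);
  return Worklist;
}

// Phase 2: only modify instructions; never consult the analysis again.
std::size_t applyCandidates(std::vector<Inst> &Insts,
                            const std::vector<std::size_t> &Worklist) {
  for (std::size_t I : Worklist)
    Insts[I].Folded = true;
  return Worklist.size();
}
```

In this sketch the only extra state is the worklist, which holds one index per transformable instruction.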

@ssahasra, can you please confirm whether I have understood the comment correctly?

Also, I have added a brief description of the proof in the pass.


https://github.com/llvm/llvm-project/pull/116953

