[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

Matt Arsenault via cfe-commits cfe-commits at lists.llvm.org
Thu Oct 31 15:58:34 PDT 2024


https://github.com/arsenm requested changes to this pull request.

We do not want or need a new pass to handle this. It does not fix the structural issue with wavesize: there is no such thing as "no wavesize" IR, only wave32 or wave64, so querying the target gives the wrong answer for faux "generic" IR. Throwing in a pass that happens to know where it runs in the pipeline, and uses that to decide when to lower, is not a real fix; IR that depends on pass placement like that is not modular.

The correct solution is to use separate wave32 and wave64 builds. InstCombine can then just directly fold the intrinsic based on the known target.
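
As a rough illustration of folding on a known target (a toy model, not code from this PR and not the actual InstCombine hook), the decision could be sketched as below. The function name `foldWavefrontSize` and the convention of returning 0 for "do not fold" are assumptions made for this example; the feature names `wavefrontsize32` and `wavefrontsize64` are the AMDGPU subtarget features.

```cpp
#include <string>

// Toy model, not LLVM code: decide whether a wavefront-size query can be
// folded to a constant from the target feature string alone.
// Returns 32 or 64 when exactly one wave size is selected, and 0
// (hypothetical "do not fold" marker) when the features leave the wave
// size unspecified or select both.
unsigned foldWavefrontSize(const std::string &Features) {
  const bool Wave32 = Features.find("+wavefrontsize32") != std::string::npos;
  const bool Wave64 = Features.find("+wavefrontsize64") != std::string::npos;
  if (Wave32 == Wave64)
    return 0; // Faux "generic" or conflicting input: nothing safe to fold.
  return Wave32 ? 32u : 64u;
}
```

With separate wave32 and wave64 builds, every compile would take one of the two constant-returning paths, so the 0 case would never be reached.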

https://github.com/llvm/llvm-project/pull/114481

