[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 31 15:58:34 PDT 2024
https://github.com/arsenm requested changes to this pull request.
We do not want or need a new pass to handle this. This is not a fix to the structural issue of wavesize. The problem is there is no such thing as a "no wavesize" IR. There is only wave32 or wave64. Querying the target gives the wrong answer for faux "generic" IR. Throwing in a pass that happens to know where it runs in the pipeline to decide when to lower is not a real fix; that is not a modular IR.
The correct solution is to use separate wave32 and wave64 builds. InstCombine can then just directly fold the intrinsic based on the known target.
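To illustrate the fold described above, here is a minimal toy model in Python. It is not LLVM's InstCombine API. The function name `fold_wavefront_size` and the representation of the subtarget as a set of feature strings are hypothetical, and the feature names are assumed to mirror AMDGPU's `wavefrontsize32` and `wavefrontsize64` subtarget features. The sketch shows only that once a build commits to exactly one wave size, the query reduces to a constant, while a faux "generic" feature set gives nothing to fold to.

```python
from typing import FrozenSet, Optional

# Assumed feature names, modeled on AMDGPU's subtarget features.
WAVE32 = "wavefrontsize32"
WAVE64 = "wavefrontsize64"


def fold_wavefront_size(features: FrozenSet[str]) -> Optional[int]:
    """Return the constant a wavefront-size query folds to, or None.

    Hypothetical helper: folding is only possible when the target
    commits to exactly one wave size.
    """
    has32 = WAVE32 in features
    has64 = WAVE64 in features
    if has32 == has64:
        # Neither or both set: the wave size is not known, so no fold.
        return None
    return 32 if has32 else 64
```

With separate wave32 and wave64 builds, every module falls into one of the two foldable cases, so no pipeline-position-dependent lowering pass is needed in this model.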
https://github.com/llvm/llvm-project/pull/114481