[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

Thu Oct 31 17:41:42 PDT 2024

AlexVlx wrote:

> We do not want or need a new pass to handle this. This is not a fix to the structural issue of wavesize. The problem is there is no such thing as a "no wavesize" IR. There is only wave32 or wave64. Querying the target gives the wrong answer for faux "generic" IR. Throwing in a pass that happens to know where it runs in the pipeline to decide when to lower is not a real fix; that is not a modular IR.
> 
> The correct solution is to use separate wave32 and wave64 builds. InstCombine can then just directly fold the intrinsic based on the known ta

The new pass is not just to handle this; it happens to handle this since it already exists. Having an unspecified, abstract quantity is not the same thing as it being absent. Faux "generic" IR sounds like a problematic concept; do you have an example? Multi-builds might be the correct solution for something, but it's unclear what that something is. Yes, if you already "fix" the wave size value, then the intrinsic is fairly spurious anyway, but that does not address the need to NOT encode it early. `InstCombine` might be an idea, but it runs a wee bit late.

https://github.com/llvm/llvm-project/pull/114481