[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

Thu Oct 31 17:59:50 PDT 2024

AlexVlx wrote:

> > Faux "generic" IR sounds like a problematic concept, do you have an example?
> 
> It's what `libc` and the ROCm DeviceLibs do: compile to IR without `-mcpu`, avoid any target-specific attributes or intrinsics, then link it into a TU later when the target is known. It's fine in principle if you hold it right, but the wavefrontsize is the one sticking point, which is why Matt would suggest having two builds of `libc`, one for `amdgcn-amd-amdhsa-wave32` and one for `amdgcn-amd-amdhsa-wave64`, or something like that.

As per my other reply, this is a valid use case, but a somewhat niche one. We can add a control value that disables this early fold for such builds, which avoids the need to do two builds (although two builds might also be fine for `libc`). I don't think ROCDL uses the intrinsic at all.
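
To make the control-value idea concrete, here is a rough, self-contained sketch of the decision it would gate. It is only a model and not the code in this PR: `FoldOptions`, `DisableEarlyFold` and `foldWavefrontSize` are hypothetical names, and the target features are represented as a plain string purely for illustration.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the proposed control value; the real option name
// and how it would be wired up are not specified in this thread.
struct FoldOptions {
  bool DisableEarlyFold = false;
};

// Returns the constant the wavefront-size query could be folded to, or
// std::nullopt when the query has to stay unresolved for a later stage.
std::optional<unsigned> foldWavefrontSize(const std::string &TargetFeatures,
                                          const FoldOptions &Opts) {
  // "Generic" library builds opt out, so the query survives until link time.
  if (Opts.DisableEarlyFold)
    return std::nullopt;
  if (TargetFeatures.find("+wavefrontsize32") != std::string::npos)
    return 32u;
  if (TargetFeatures.find("+wavefrontsize64") != std::string::npos)
    return 64u;
  // Target not known yet, so there is nothing to fold.
  return std::nullopt;
}
```

With something along these lines, a `libc`-style build would set the control value once and be linked into either a wave32 or a wave64 TU later, while ordinary builds keep the early fold.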

https://github.com/llvm/llvm-project/pull/114481