[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

Thu Oct 31 19:52:29 PDT 2024

arsenm wrote:

Mechanically, this pass can be replaced with trivial handling of the intrinsic in AMDGPUInstCombineIntrinsic; we don't need a new module pass. As inserted into the pipeline here, this does not have any advantage over handling it directly in instcombine.

> We could just turn this off for a particular compilation and maintain the current unfoldable state.

This violates the fundamental principles of a modular compiler IR. Any mechanism which we would have to invent to stop this fold from happening in a specific bitcode library build will be quite unsavory, and require handholding of every user to not run into the same issue. I'd like to systematically avoid this class of problem by having a separate library build.

> but apparently +wavefrontsize32 on the function isn't enough as per Matt's reply.

The toothpaste is out of the tube once the IR is produced. If some toolchain were relying on the global target machine features, there are opportunities for error on each tool invocation. The absence of the attribute does not tell you what the final compilation context will be. 

https://github.com/llvm/llvm-project/pull/114481