[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

Mon Jan 27 17:00:34 PST 2025

jhuber6 wrote:

> It is six months old, but it is relatively difficult to keep track of all the concurrent non-communicating swimlanes - I do apologise for not having done so though! It is possible that we are microfocusing on a potential use case, but there are other valid ones. For example, ignoring the SPIR-V angle, one could emit IR from a HIP app compiled with -nogpulib. HIP headers forward various functions to ROCDL symbols, which means that the IR will include calls to ROCDL symbols. It is valid to then take said IR as input, and "finalise" it for AMDGPU, linking the device libs in the process. In this case you cannot disambiguate based on the source language / file extension. An example of an invocation that ends up here would be the following:
> 
> ```shell
> clang -O3 -x ir -v foo.bc -mcode-object-version=5 --save-temps -fcolor-diagnostics -fexceptions -fvisibility=hidden -fconvergent-functions -ffp-contract=fast-honor-pragmas -fno-experimental-relative-c++-abi-vtables -O3 -fno-autolink -finline-functions -o foo.co -target amdgcn-amd-amdhsa -mcpu=gfx1030
>```

Okay, so this case definitely tries to link the ROCm bitcode via `-mlink-builtin-bitcode` and this patch changes that. I would personally say that this shouldn't be the default behavior because the ROCm device libs are very language specific things and we shouldn't be encoding that into plain LLVM-IR. If someone does `-nogpulib` I don't think it's our responsibility to fix things up for them. People can always do `-Xclang -mlink-builtin-bitcode -Xclang /opt/rocm/amdgcn/bitcode/ockl.bc` even if it is extremely painful.

If there's an existing use-case that this breaks, I can revert it. Long term however it would be nice if the ROCm device libraries could be pulled into a more friendly library format. Then you could just pass it to `ld.lld` and everything's fine. Generally my understanding is that the ROCm DL's should be fixed up by the language front-end, so if anything makes it to the IR level the user did something non-standard, and I don't think that's our fault considering they can still do it manually if needed.

> In the future, it would also be somewhat helpful, if possible, to advertise what amounts to flipping a long-established default, even when said default is dubious.

I need to determine the optimal graph of AMD engineers to include in the reviewers list of patches like these.

https://github.com/llvm/llvm-project/pull/99687