[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

Mon Nov 18 09:50:43 PST 2024

jhuber6 wrote:

> I don't really understand why cluster users are compiling on a system where the GPUs are being stressed, and I still don't see why it's a good idea to break layering for this case.

I'd like to eliminate a class of failures I've seen where `--offload-arch=native` causes the compilation job to fail, hang forever, or take a really long time. I can't attest to whether the Driver team will modify this interface, but since it's hard-coded in ROCT I doubt it. This is also sufficiently tested by `libc`, which uses `-mcpu=native` and has its own bot, along with some downstream OpenMP tests using `--offload-arch=native`, so I don't think this would silently fail and make it into a release.

> Also, I wasn't aware that the "native" offload arch is supported by ROCm.

HIP does, at least upstream; I don't know if that was modified in ROCm.

https://github.com/llvm/llvm-project/pull/116651