[clang] [compiler-rt] [llvm] [PGO][AMDGPU] Add offload profiling with uniformity-aware optimization (PR #177665)

Joseph Huber via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 18 06:12:06 PDT 2026


jhuber6 wrote:

> Updated to use library calls (`__gpu_pgo_is_sampled`, `__gpu_pgo_increment` in `InstrProfilingGPU.c`) for the GPU instrumentation instead of inline IR, building on Joseph's profile library infrastructure.

I have the basic form of this in https://github.com/llvm/llvm-project/pull/185763, but without the uniformity checks. For copying to the HIP runtime, I have https://github.com/llvm/llvm-project/pull/187136, which should hopefully simplify that. In addition, https://github.com/llvm/llvm-project/pull/185761 should cause `libclang_rt.profile.a` to be included automatically, if present, in the offloading compilation.
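For readers following along, here is a rough host-side sketch of what a sampled, uniformity-aware counter update could look like. Everything in it is an assumption made for illustration: the `sketch_*` names, their signatures, the power-of-two sampling rule, and the lane-mask representation are hypothetical and are not taken from `InstrProfilingGPU.c` or from any of the PRs above. The idea shown is only that a uniform branch lets a whole wavefront record its hits with one update rather than one update per lane.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a sampling check: accept one out of every
   2^shift invocations. The real __gpu_pgo_is_sampled may work differently. */
static int sketch_is_sampled(uint64_t invocation_id, unsigned shift) {
  assert(shift < 64); /* shifting by 64 or more would be undefined */
  return (invocation_id & ((UINT64_C(1) << shift) - 1)) == 0;
}

/* Portable population count: number of set bits in a lane mask. */
static unsigned sketch_popcount(uint64_t mask) {
  unsigned n = 0;
  for (; mask != 0; mask &= mask - 1)
    ++n;
  return n;
}

/* Hypothetical stand-in for a uniform counter update: when every active
   lane reaches the same counter, add the active lane count in one step
   instead of issuing one increment per lane. */
static void sketch_increment(uint64_t *counter, uint64_t active_lanes) {
  *counter += sketch_popcount(active_lanes);
}
```

With `shift = 2` and a mask of `0xF` (four active lanes), invocation ids 0 through 15 sample four times and the counter ends at 16. On a real GPU the single update would presumably be an atomic add issued by one lane, which this host-side sketch does not model.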

https://github.com/llvm/llvm-project/pull/177665

