[clang] Improve HIP docs on fat binary registration ordering (PR #168566)
Yaxun Liu via cfe-commits
cfe-commits at lists.llvm.org
Mon Nov 24 09:51:15 PST 2025
================
@@ -210,6 +210,88 @@ Host Code Compilation
- These relocatable objects are then linked together.
- Host code within a TU can call host functions and launch kernels from another TU.
+HIP Fat Binary Registration and Unregistration
+==============================================
+
+When compiling HIP for AMD GPUs, Clang embeds device code into HIP "fat
+binaries" and generates host-side helper functions that register these
+fat binaries with the HIP runtime at program start and unregister them at
+program exit. In non-RDC mode (``-fno-gpu-rdc``), each compilation unit
+typically produces its own self-contained fat binary per GPU architecture. In
+RDC mode (``-fgpu-rdc``), device bitcode from multiple compilation units may be
+linked together into a single fat binary per GPU architecture.
----------------
yxsamliu wrote:
will revise
https://github.com/llvm/llvm-project/pull/168566