[clang] [CUDA] Include PTX in non-RDC mode using the new driver (PR #84367)

Artem Belevich via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 7 12:02:06 PST 2024


================
@@ -4625,7 +4625,15 @@ Action *Driver::BuildOffloadingActions(Compilation &C,
       DDeps.add(*A, *TCAndArch->first, TCAndArch->second.data(), Kind);
       OffloadAction::DeviceDependences DDep;
       DDep.add(*A, *TCAndArch->first, TCAndArch->second.data(), Kind);
+
+      // Compiling CUDA in non-RDC mode uses the PTX output if available.
+      for (Action *Input : A->getInputs())
+        if (Kind == Action::OFK_Cuda && A->getType() == types::TY_Object &&
+            !Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc,
----------------
Artem-B wrote:

I'm not quite sure why we would need to include PTX for RDC compilation.

In retrospect, including PTX by default with all compilations turned out to be the wrong default.
It's just a waste of space for most users, and it lets problems go unnoticed for longer than they should (e.g. something was compiled for the wrong GPU).

Switching to the new driver is a good opportunity to make a better choice. I would argue that we should not include PTX by default. If we do decide that it may be useful, we should add it only for the most recent of the chosen GPU variants, to provide some forward compatibility, rather than for all of them.
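
To make the "most recent variant only" option concrete, here is a minimal standalone sketch of the selection step. It is not driver code: `smVersion` and `archForEmbeddedPTX` are hypothetical names, and the sketch assumes architectures are spelled `sm_NN` and that a larger number means a newer GPU. Any trailing suffix after the digits is ignored here for simplicity.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: parse the numeric part of an "sm_NN" style name.
// Returns -1 when the name does not start with "sm_" followed by a digit,
// so unrecognized entries are never selected.
static int smVersion(const std::string &Arch) {
  const std::string Prefix = "sm_";
  if (Arch.compare(0, Prefix.size(), Prefix) != 0)
    return -1;
  std::size_t I = Prefix.size();
  if (I >= Arch.size() || !std::isdigit(static_cast<unsigned char>(Arch[I])))
    return -1;
  int Version = 0;
  for (; I < Arch.size() && std::isdigit(static_cast<unsigned char>(Arch[I]));
       ++I)
    Version = Version * 10 + (Arch[I] - '0');
  return Version;
}

// Hypothetical policy: pick the single architecture whose PTX would be
// embedded, namely the one with the highest parsed version.
// Returns an empty string when no entry can be parsed.
std::string archForEmbeddedPTX(const std::vector<std::string> &Archs) {
  std::string Best;
  int BestVersion = -1;
  for (const std::string &Arch : Archs) {
    int V = smVersion(Arch);
    if (V > BestVersion) {
      BestVersion = V;
      Best = Arch;
    }
  }
  return Best;
}
```

With this policy, a build targeting `sm_70`, `sm_80` and `sm_90` would carry PTX only for `sm_90`, while the other two would ship device binaries alone.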

https://github.com/llvm/llvm-project/pull/84367

