[clang] [CUDA] Include PTX in non-RDC mode using the new driver (PR #84367)
Joseph Huber via cfe-commits
cfe-commits at lists.llvm.org
Thu Mar 7 12:10:27 PST 2024
================
@@ -4625,7 +4625,15 @@ Action *Driver::BuildOffloadingActions(Compilation &C,
DDeps.add(*A, *TCAndArch->first, TCAndArch->second.data(), Kind);
OffloadAction::DeviceDependences DDep;
DDep.add(*A, *TCAndArch->first, TCAndArch->second.data(), Kind);
+
+ // Compiling CUDA in non-RDC mode uses the PTX output if available.
+ for (Action *Input : A->getInputs())
+ if (Kind == Action::OFK_Cuda && A->getType() == types::TY_Object &&
+ !Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc,
----------------
jhuber6 wrote:
Yeah, I don't have my finger on the pulse of the CUDA users here. I think we want this patch to match the current behavior of `--cuda-include-ptx`, since the decision whether or not to include the PTX seems to be made at job creation time. We could then potentially change the default of `--cuda-include-ptx` if that's the preferred solution.
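For readers following along, here is a rough stand-alone sketch of the kind of job-creation-time decision being discussed. It is not the driver code from this patch. `PtxFlags`, `shouldIncludePtx`, and `wantsPtxInput` are hypothetical names, and the "included by default, last matching flag wins, `all` matches every arch" rules are assumptions made for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed command line. Each entry is
// (enable, arch): `true` models --cuda-include-ptx=<arch>, `false` models
// --no-cuda-include-ptx=<arch>, kept in the order they were given.
using PtxFlags = std::vector<std::pair<bool, std::string>>;

// Assumption: PTX is included unless a flag says otherwise, the last flag
// that matches the architecture wins, and "all" matches every architecture.
bool shouldIncludePtx(const PtxFlags &Flags, const std::string &Arch,
                      bool DefaultInclude = true) {
  bool Include = DefaultInclude;
  for (const auto &Flag : Flags)
    if (Flag.second == "all" || Flag.second == Arch)
      Include = Flag.first;
  return Include;
}

// Hypothetical job-creation-time check: only CUDA compiles in non-RDC mode
// are candidates, and the include-PTX flags have the final say.
bool wantsPtxInput(bool IsCuda, bool IsRdc, const PtxFlags &Flags,
                   const std::string &Arch) {
  return IsCuda && !IsRdc && shouldIncludePtx(Flags, Arch);
}
```

Under these assumptions, changing the default later would only mean flipping `DefaultInclude`; the per-architecture overrides would keep working the same way.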
https://github.com/llvm/llvm-project/pull/84367