[PATCH] D120272: [CUDA] Add driver support for compiling CUDA with the new driver

Fri Apr 22 11:11:12 PDT 2022

tra added a subscriber: rnk.
tra added inline comments.

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:6225
   if (IsCuda || IsHIP) {
-    if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false))
+    if (Args.hasArg(options::OPT_fno_gpu_rdc) && IsCudaDevice &&
+        Args.hasArg(options::OPT_foffload_new_driver))
----------------
This has to be `Args.hasArg(options::OPT_fno_gpu_rdc) && Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false) == false`

E.g. we don't want a warning if we have `-foffload-new-driver -fno-gpu-rdc -fgpu-rdc`.

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:6228
+        D.Diag(diag::warn_drv_no_rdc_new_driver) 
+            << "SampleUse with PGO options";
+    if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false) ||
----------------
The warning does not take any parameters and this one looks wrong anyways.

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:6896
+      if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc,
+                       false))
         CmdArgs.push_back("-fgpu-rdc");
----------------
I'm not sure why we're no longer checking for `OPT_foffload_new_driver` here. Don't we want to have the same RDC mode on the host and device sides?
I think we do as that affects the way we mangle some symbols and it has to be consistent on both sides.

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:6223-6224
   if (IsCuda || IsHIP) {
-    if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false))
+    if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false) ||
+        Args.hasArg(options::OPT_foffload_new_driver))
       CmdArgs.push_back("-fgpu-rdc");
----------------
jhuber6 wrote:
> tra wrote:
> > tra wrote:
> > > jhuber6 wrote:
> > > > tra wrote:
> > > > > If user specifies both `-fno-gpu-rdc` and `-foffload-new-driver` we would still enable RDC compilation.
> > > > > We may want to at least issue a warning. 
> > > > > 
> > > > > Considering that  we have multiple places where we may check for `-f[no]gpu-rdc` we should make sure we don't get different ideas whether RDC has been enabled.
> > > > > I think it may make sense to provide a common way to figure it out. Either via a helper function that would process CLI arguments or calculate it once and save it somewhere.
> > > > I haven't quite finalized how to handle this. The new driver should be compatible with a non-RDC build since we simply wouldn't embed the device image or create offloading entries. It's a little bit more difficult here since the new method is opt-in so it requires a flag. We should definitely emit a warning if both are enabled (I'm assuming there's one for passing both `fgpu-rdc` and `fno-gpu-rdc`). I'll add one in.
> > > > 
> > > > Also we could consider the new driver *the* RDC in the future which would be the easiest. The problem is if we want to support CUDA's method of RDC considering how other build systems seem to expect it. I could see us embedding the fatbinary in the object file, even if unused, just so that cuobjdump works. However we couldn't support the generation of `__cudaRegisterFatBinary_nv....` functions because then those would cause linker errors. WDYT?
> > > > I'm assuming there's one for passing both fgpu-rdc and fno-gpu-rdc
> > > 
> > > This is not a valid assumption. The whole idea behind `-fno-something` is that the options can be overridden. E.g. if the build specifies a standard set of compiler options, but we need to override some of them when building a particular file. We can only do so by appending to the standard options. Potentially we may end up having those options overridden again. While it's not very common, it's certainly possible. It's also possible to start with '-fno-gpu-rdc' and then override it with `-fgpu-rdc`.
> > > 
> > > In this case, we care about the final state of RDC specified by -f*gpu-rdc options, not by the fact that `-fno-gpu-rdc` is present. `Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false)` does exactly that -- gives you the final state. If it returns false, but we have `-foffload-new-driver`, then we need a warning as these options are contradictory.
> > > 
> > > The new driver should be compatible with a non-RDC build 
> > 
> > In that case, we don't need a warning, but we do need a test verifying this behavior.
> > 
> > > Also we could consider the new driver *the* RDC in the future which would be the easiest. 
> > 
> > SGTM. I do not know how it all will work out in the end. Your proposed model makes a lot of sense, and I'm guardedly optimistic about it.
> > Eventually we would deprecate RDC options, but we still need to work sensibly when user specifies a mix of these options. 
> > 
> > 
> > In that case, we don't need a warning, but we do need a test verifying this behavior.
> > 
> It's possible but I don't have the logic here to do it, figured we can cross that bridge later.
> 
> > SGTM. I do not know how it all will work out in the end. Your proposed model makes a lot of sense, and I'm guardedly optimistic about it.
> >
> So the only downsides I know of, is that we don't currently replicate CUDA's magic to JIT RDC code (We can do this with LTO anyway), and that registering offload entries relies on the linker defining `__start / __stop` variables, which I don't know if linkers on Windows / MacOS provide. I'd be really interested if someone on the LLD team knew the answer to that.
> relies on the linker defining __start / __stop variables, which I don't know if linkers on Windows / MacOS provide. I'd be really interested if someone on the LLD team knew the answer to that.

@MaskRay, @rnk - would you happen to know the answer?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120272/new/

https://reviews.llvm.org/D120272