[PATCH] D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue May 24 13:09:09 PDT 2022


tra added inline comments.


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:6836
+
+  // If the CUID is not specified we try to generate a unique postfix.
+  if (getLangOpts().CUID.empty()) {
----------------
jhuber6 wrote:
> tra wrote:
> > > However, [CUID] is not always availible. 
> > 
> > The question is -- when and why is it not available? I'm getting the feeling that we're fixing the consequence here, not the root cause.
> > 
> > Is there a reason we can't make sure that the driver always generates a cuid for offload subcompilations and error out if it's needed but is not provided?
> > That would make this fallback unnecessary and would be a more robust approach in general.
> > 
> So, I'm more in favor of this approach because it doesn't require extra intervention from the compiler driver, this makes it less convoluted to do split compilation since we don't have an extra arguments. The way I would prefer it, is that we do this implicitly by default without requiring extra thought from the driver, but if it's not good enough we can support the manual `CUID` approach to let the user override it. I think this is a cleaner implementation, and is mostly coming from my support for CUDA in the new driver which currently doesn't implement the CUID as we do with the old driver. Generally I'd prefer things to behave independent of the driver, so we can consider host and device compilation more separately.
> So, I'm more in favor of this approach because it doesn't require extra intervention from the compiler driver

We need the driver intervention for any cc1 compilations anyways, so this does not buy us anything.  While you can run a sub-compilation manually with handcrafted cc1 flags, that's not a practical use case. The driver is the ultimate source of cc1 flags.

> this makes it less convoluted to do split compilation since we don't have an extra arguments.

For CUDA/HIP sub-compilation should be done with clang --cuda-host-only/--cuda-device-only.  Whether the driver supplies yet another cc1 option, --cuid=... makes no difference to the user launching such sub-compilation. 

> The way I would prefer it, is that we do this implicitly by default without requiring extra thought from the driver, but if it's not good enough we can support the manual CUID approach to let the user override it.

I agree that we can come up with something that will almost always work. Possibly even good enough for all practical purposes. However, if a better solution would take comparable effort, it would make sense to do things right and avoid adding technical debt. 

On the other hand, requiring the driver to supply identical cuid to all sub-compilations appears to be a better approach to me:
* Driver is the best place to do it, functionally. Driver has access to all user-provided inputs and is in position to guarantee that all subcompilations get the same cuid.
* Calculating CUID in the driver keeps relevant logic in one place. Doing it in the driver *and* in the codegen 
* Figuring out what inputs are relevant for calculation of CUID in cc1 invocation is error prone. E.g. we have to guess which cc1 options are relevant or not and is the driver would pass a macro to one subcompilation but not to another, we would end up generating mismatching CUID and would not have any way to notice that. Even when that's not the case, we would need to guess which flags, supplied by the driver, are relevant. At CC1 level that may be somewhat complicated as top-level options may expand to quite a few more cc1 options. E.g. we'll need to take into account `-std=...`, `--cuda-path=`, `-include ...`, `-I` (and other include paths)... All of that does not belong to the codegen.

The driver is already doing CUID computation, so I do not see any downsides to just letting it do its job, and I do believe it will be a better, and likely less complicated, solution.

> ... mostly coming from my support for CUDA in the new driver which currently doesn't implement the CUID as we do with the old driver

Right. That appears to be the key missing piece.

What are the obstacles for having CUID calculation done in the new driver. It should have all the info it needs. What am I missing?



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125904/new/

https://reviews.llvm.org/D125904



More information about the cfe-commits mailing list