[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

Sat Aug 19 11:23:42 PDT 2023

MaskRay added inline comments.

================
Comment at: clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c:9
+
+// Check that -fsplit-machine-functions is passed to both x86 and cuda compilation and does not cause driver error.
+// MFS2: -fsplit-machine-functions
----------------
MaskRay wrote:
> tra wrote:
> > shenhan wrote:
> > > tra wrote:
> > > > shenhan wrote:
> > > > > tra wrote:
> > > > > > We will still see a warning, right? So, for someone compiling with `-Werror` that's going to be a problem.
> > > > > > 
> > > > > > Also, if the warning is issued from the top-level driver, we may not even be able to suppress it when we disable splitting on GPU side with `-Xarch_device -fno-split-machine-functions`.
> > > > > > 
> > > > > > 
> > > > > > We will still see a warning, right?
> > > > > Yes, there still will be a warning. We've discussed it and we think that pass -fsplit-machine-functions in this case is not a proper usage and a warning is warranted, and it is not good that skip doing split silently while uses explicitly ask for it.
> > > > > 
> > > > > > Also, if the warning is issued from the top-level driver
> > > > > The warning will not be issued from the top-level driver, it will be issued when configuring optimization passes.
> > > > > So:
> > > > > 
> > > > > 
> > > > >   - -fsplit-machine-functions -Xarch_device -fno-split-machine-functions
> > > > > Will enable MFS for host, disable MFS for gpus and without any warnings.
> > > > > 
> > > > >   - -Xarch_host -fsplit-machine-functions
> > > > > The same as the above
> > > > > 
> > > > >   - -Xarch_host -fsplit-machine-functions -Xarch_device -fno-split-machine-functions
> > > > > The same as the above
> > > > > 
> > > > > We've discussed it and we think that pass -fsplit-machine-functions in this case is not a proper usage and a warning is warranted, and it is not good that skip doing split silently while uses explicitly ask for it.
> > > > 
> > > > I would agree with that assertion if we were talking exclusively about CUDA compilation.
> > > > However, a common real world use pattern is that the flags are set globally for all C++ compilations, and then CUDA compilations within the project need to do whatever they need to to keep things working. The original user intent was for the option to affect the host compilation. There's no inherent assumption that it will do anything useful for the GPU.
> > > > 
> > > > In number of similar cases in the past we did settle on silently ignoring some top-level flags that we do expect to encounter in real projects, but which made no sense for the GPU. E.g. sanitizers. If the project is built w/ sanitizer enabled, the idea is to sanitize the host code, The GPU code continues to be built w/o sanitizer enabled. 
> > > > 
> > > > Anyways, as long as we have a way to deal with it it's not a big deal one way or another.
> > > > 
> > > > > -fsplit-machine-functions -Xarch_device -fno-split-machine-functions
> > > > > Will enable MFS for host, disable MFS for gpus and without any warnings.
> > > > 
> > > > OK. This will work.
> > > > 
> > > > 
> > > > In number of similar cases in the past we did settle on silently ignoring some top-level flags that we do expect to encounter in real projects, but which made no sense for the GPU. E.g. sanitizers. If the project is built w/ sanitizer enabled, the idea is to sanitize the host code, The GPU code continues to be built w/o sanitizer enabled.
> > > 
> > > Can I understand it this way - if the compiler is **only** building for CPUs, then silently ignore any optimization flags is not a good behavior. If the compiler is building CPUs and GPUs, it is still not a good behavior to silently ignore optimization flags for CPUs, but it is probably ok to silently ignore optimization flags for GPUs.
> > > 
> > > > OK. This will work.
> > > Thanks for confirming.
> > >  it is probably ok to silently ignore optimization flags for GPUs.
> > 
> > In this case, yes. 
> > 
> > I think the most consistent way to handle the situation is to keep the warning in place at cc1 compiler level, but change the driver behavior (and document it) so that it does not pass the splitting options to offloading sub-compilations. This way we'll do the sensible thing for the most common use case, yet would still warn if the user tries to enable the splitting where they should not (e.g. by using `-Xclang -fsplit-machine-functions` during CUDA compilation)
> > 
> > 
> > 
> > 
> There are excessive spaces before `%clang`. We should keep just one space: `RUN: %clang`
I agree with @tra's analysis. Either do nothing on Clang side and requiring `-fsplit-machine-functions -Xarch_device -fno-split-machine-functions` or ignoring the option when creating a device job works for me.

This patch changed the behavior in an unintended direction.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157750/new/

https://reviews.llvm.org/D157750