[PATCH] D103579: [LTO] Fix -fwhole-program-vtables handling after HIP ThinLTO patch

Thu Jun 3 13:11:15 PDT 2021

tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

LGTM for CUDA.

@yaxunl Sam, does the change make sense for HIP?

================
Comment at: clang/test/Driver/cuda-options.cu:190-192
+// RUN: | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
+// RUN:    -check-prefix HOST -check-prefix INCLUDES-DEVICE \
+// RUN:    -check-prefix NOLINK -check-prefix THINLTOWPD %s
----------------
Nit: This could be combined into `--check-prefixes=DEVICE,DEVICE-NOSAVE,....`

================
Comment at: clang/test/Driver/hip-options.hip:63
 // RUN: %clang -### -target x86_64-unknown-linux-gnu -nogpuinc -nogpulib \
-// RUN:   --cuda-gpu-arch=gfx906 -foffload-lto=thin %s 2>&1 \
-// RUN:   | FileCheck -check-prefix=THINLTO %s
+// RUN:   --cuda-gpu-arch=gfx906 -foffload-lto=thin -fwhole-program-vtables %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=HIPTHINLTO %s
----------------
tejohnson wrote:
> tra wrote:
> > tejohnson wrote:
> > > tra wrote:
> > > > > caused a regression when compiling cuda code with -flto=thin -fwhole-program-vtables.
> > > > 
> > > > We should add a CUDA test for that. This test only covers HIP compilation.
> > > AFAICT there are no existing Cuda lto tests in clang/test/Driver that I could add -fwhole-program-vtables to.
> > > 
> > > However, for the purposes of this bug fix, I think adding the below testing here should be sufficient - it triggers exactly the same way as when I saw this in an internal cuda build.
> > > AFAICT there are no existing Cuda lto tests in clang/test/Driver that I could add -fwhole-program-vtables to.
> > 
> > `cuda-options.cu` would be the right place.
> > 
> > > However, for the purposes of this bug fix, I think adding the below testing here should be sufficient - it triggers exactly the same way as when I saw this in an internal cuda build.
> > 
> > The key difference is that HIP does support LTO on the GPU side, but CUDA's does not, which suggests that their handling of lto-related flags is likely different and worth testing that we do see the expected behavior. E.g. HIP compilation with `-foffload-lto=thin` does propagate LTO flags to device compilation, but CUDA compilation should not (but should still compile with LTO enabled on the host.
> > 
> > While this patch does fix the issue for both CUDA and HIP, it's good to have a test which demonstrates how we expect HIP/CUDA to behave. Right now we only have that for HIP.
> > 
> Added the cuda test, and confirmed it fails with the error without my patch.
Thank you.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103579/new/

https://reviews.llvm.org/D103579