[clang] f50a7c7 - [LinkerWrapper] Fix optimized debugging builds for NVPTX LTO

Tue Sep 27 08:49:35 PDT 2022

Author: Joseph Huber
Date: 2022-09-27T10:49:17-05:00
New Revision: f50a7c7a26e074231cc9a60a3fcaef7520ceb67f

URL: https://github.com/llvm/llvm-project/commit/f50a7c7a26e074231cc9a60a3fcaef7520ceb67f
DIFF: https://github.com/llvm/llvm-project/commit/f50a7c7a26e074231cc9a60a3fcaef7520ceb67f.diff

LOG: [LinkerWrapper] Fix optimized debugging builds for NVPTX LTO

The ptxas assembler does not accept the `-g` flag together with
optimizations. Normally the driver degrades `-g` to line info in that
case, but the LTO path skipped this step and the linker wrapper did not
degrade the option itself. Note that this only works if the user also
passes `-g` to the linker invocation. Handling `-g` given only at
compile time will require setting some flags in the binary to indicate
that debugging was used when building.
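
A minimal sketch of the flag selection described above, written as a
standalone function rather than the wrapper's real code. The name
`buildPtxasDebugArgs` and its signature are hypothetical; only the rule
itself (`-g` at `O0`, `-lineinfo` otherwise) comes from this change.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of ClangLinkerWrapper.cpp): builds the
// optimization and debug arguments that would be handed to ptxas.
// OptLevel is assumed to be spelled without the leading dash, e.g. "O0".
std::vector<std::string> buildPtxasDebugArgs(const std::string &OptLevel,
                                             bool Debug) {
  std::vector<std::string> Args;
  Args.push_back("-" + OptLevel);
  if (!Debug)
    return Args;

  // ptxas rejects -g combined with optimizations, so full debug info is
  // only requested at O0; any other level is degraded to line info.
  bool Unoptimized = OptLevel.size() > 1 && OptLevel[1] == '0';
  Args.push_back(Unoptimized ? "-g" : "-lineinfo");
  return Args;
}
```

Under these assumptions, an `O0` debug build yields `-O0 -g`, while an
`O2` debug build yields `-O2 -lineinfo`.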

This fixes #57990

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D134660

Added: 
    

Modified: 
    clang/test/Driver/linker-wrapper.c
    clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Removed: 
    


################################################################################
diff --git a/clang/test/Driver/linker-wrapper.c b/clang/test/Driver/linker-wrapper.c
index dd0fba763cd46..e8c127501e26e 100644
--- a/clang/test/Driver/linker-wrapper.c
+++ b/clang/test/Driver/linker-wrapper.c
@@ -24,10 +24,19 @@
 // RUN:   --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
 // RUN:   --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
 // RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
-// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run --device-debug \
-// RUN:   --linker-path=/usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=NVPTX_LINK_DEBUG
+// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run --device-debug -O0 \
+// RUN:   --linker-path=/usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=NVPTX-LINK-DEBUG
 
-// NVPTX_LINK_DEBUG: nvlink{{.*}}-m64 -g -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o
+// NVPTX-LINK-DEBUG: nvlink{{.*}}-m64 -g -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o
+
+// RUN: clang-offload-packager -o %t.out \
+// RUN:   --image=file=%S/Inputs/dummy-bc.bc,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
+// RUN:   --image=file=%S/Inputs/dummy-bc.bc,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
+// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
+// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run --device-debug -O2 \
+// RUN:   --linker-path=/usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=NVPTX-LINK-DEBUG-LTO
+
+// NVPTX-LINK-DEBUG-LTO: ptxas{{.*}}-m64 -o {{.*}}.cubin -O2 --gpu-name sm_70 -lineinfo {{.*}}.s
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908 \

diff --git a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
index d29c4f93d60f7..40825ac831a50 100644
--- a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -295,8 +295,10 @@ Expected<StringRef> assemble(StringRef InputFile, const ArgList &Args,
   CmdArgs.push_back(Args.MakeArgString("-" + OptLevel));
   CmdArgs.push_back("--gpu-name");
   CmdArgs.push_back(Arch);
-  if (Args.hasArg(OPT_debug))
+  if (Args.hasArg(OPT_debug) && OptLevel[1] == '0')
     CmdArgs.push_back("-g");
+  else if (Args.hasArg(OPT_debug))
+    CmdArgs.push_back("-lineinfo");
   if (RDC)
     CmdArgs.push_back("-c");