[llvm] e731a26 - [DebugInfo][Split DWARF][LTO]: Ensure only a single CU is emitted

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Wed May 31 17:21:11 PDT 2023


Author: David Blaikie
Date: 2023-06-01T00:21:00Z
New Revision: e731a2678c7cf81b1d3817489a52b519cb14f85e

URL: https://github.com/llvm/llvm-project/commit/e731a2678c7cf81b1d3817489a52b519cb14f85e
DIFF: https://github.com/llvm/llvm-project/commit/e731a2678c7cf81b1d3817489a52b519cb14f85e.diff

LOG: [DebugInfo][Split DWARF][LTO]: Ensure only a single CU is emitted

Split DWARF doesn't handle LTO of any form (roughly there's an
assumption that each dwo file will have one CU - it's not explicitly
documented, nor explicitly handled, so the ecosystem isn't really well
understood/tested/etc).

This had previously been handled by implementing (& disabling by
default) the `-split-dwarf-cross-cu-references` flag, which would
disable use of ref_addr across two dwo CUs.

This worked for a while, at least in LTO (it didn't address Split
DWARF+Full LTO, but that's an unlikely combination, as the benefits of
Split DWARF are more limited in a full LTO build) - because the only
source of cross-CU references was inlined functions, so by making those
non-cross-CU (by moving the referenced inlined function DWARF
description into the referencing CU) the result was one CU per dwo.

But recently the Function Specialization pass was added to the ThinLTO
pipeline, which caused imported functions that may not be inlined to be
emitted by a backend compile. This meant foreign CU entities (not just
abstract origins/cross-CU referenced entities)/standalone foreign CUs
could be emitted by a backend compile.

The end result was, due to a bug* in binutils dwp (I think basically
it saw two CUs in a single dwo and reprocessed the offsets in the shared
debug_str_offsets.dwo section) this situation lead to corrupted strings.

So to make this more robust, I've generalized the definition of the
`-split-dwarf-cross-cu-references` flag (perhaps it should be renamed at
this point, but it's /really/ niche, doubt anyone's using it - more or
less there for experimentation when we get around to figuring out
spec'ing LTO+Split DWARF) to mean "single CU in a dwo file" and added
more general handling for this.

There's certainly some weird corner cases that could come up in terms of
"how do we choose which CU to put everything in" - for now it's "first
come, first served" which is probably going to be OK for ThinLTO - the
base module will have the first functions and first CU, imported
fragments will come after that. For LTO the choice will be fairly
arbitrary - but, again, essentially whichever module comes first.

* Arguably a bug in binutils dwp, but since the feature isn't well
  specified, I'd rather avoid dabbling in this uncertain area and ensure
  LLVM doesn't produce especially novel DWARF (dwos with multiple CUs)
  regardless of whether binutils dwp would/should be fixed. I'm not
  confident debuggers could read such a dwo file well, etc.

Added: 
    

Modified: 
    llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
    llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
    llvm/test/DebugInfo/X86/split-dwarf-cross-cu-gmlt-g.ll
    llvm/test/DebugInfo/X86/split-dwarf-cross-unit-reference.ll
    llvm/test/DebugInfo/X86/string-offsets-table-order.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index ba08680e9d715..a24e6e2e99229 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -1094,6 +1094,13 @@ DwarfDebug::getOrCreateDwarfCompileUnit(const DICompileUnit *DIUnit) {
   if (auto *CU = CUMap.lookup(DIUnit))
     return *CU;
 
+  if (useSplitDwarf() &&
+      !shareAcrossDWOCUs() &&
+      (!DIUnit->getSplitDebugInlining() ||
+       DIUnit->getEmissionKind() == DICompileUnit::FullDebug) &&
+      !CUMap.empty()) {
+    return *CUMap.begin()->second;
+  }
   CompilationDir = DIUnit->getDirectory();
 
   auto OwnedUnit = std::make_unique<DwarfCompileUnit>(
@@ -1297,6 +1304,8 @@ void DwarfDebug::finalizeModuleInfo() {
   if (CUMap.size() > 1)
     DWOName = Asm->TM.Options.MCOptions.SplitDwarfFile;
 
+  bool HasEmittedSplitCU = false;
+
   // Handle anything that needs to be done on a per-unit basis after
   // all other generation.
   for (const auto &P : CUMap) {
@@ -1315,6 +1324,10 @@ void DwarfDebug::finalizeModuleInfo() {
     bool HasSplitUnit = SkCU && !TheCU.getUnitDie().children().empty();
 
     if (HasSplitUnit) {
+      (void)HasEmittedSplitCU;
+      assert((shareAcrossDWOCUs() || !HasEmittedSplitCU) &&
+             "Multiple CUs emitted into a single dwo file");
+      HasEmittedSplitCU = true;
       dwarf::Attribute attrDWOName = getDwarfVersion() >= 5
                                          ? dwarf::DW_AT_dwo_name
                                          : dwarf::DW_AT_GNU_dwo_name;
@@ -2267,7 +2280,7 @@ void DwarfDebug::endFunctionImpl(const MachineFunction *MF) {
 
   LexicalScope *FnScope = LScopes.getCurrentFunctionScope();
   assert(!FnScope || SP == FnScope->getScopeNode());
-  DwarfCompileUnit &TheCU = *CUMap.lookup(SP->getUnit());
+  DwarfCompileUnit &TheCU = getOrCreateDwarfCompileUnit(SP->getUnit());
   if (TheCU.getCUNode()->isDebugDirectivesOnly()) {
     PrevLabel = nullptr;
     CurFn = nullptr;

diff  --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index c7c40087a5817..c11e3d2c9f832 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -543,7 +543,7 @@ void DwarfUnit::addAccess(DIE &Die, DINode::DIFlags Flags) {
 }
 
 DIE *DwarfUnit::getOrCreateContextDIE(const DIScope *Context) {
-  if (!Context || isa<DIFile>(Context))
+  if (!Context || isa<DIFile>(Context) || isa<DICompileUnit>(Context))
     return &getUnitDie();
   if (auto *T = dyn_cast<DIType>(Context))
     return getOrCreateTypeDIE(T);

diff  --git a/llvm/test/DebugInfo/X86/split-dwarf-cross-cu-gmlt-g.ll b/llvm/test/DebugInfo/X86/split-dwarf-cross-cu-gmlt-g.ll
index ac5a10652ba92..03a444976e3e9 100644
--- a/llvm/test/DebugInfo/X86/split-dwarf-cross-cu-gmlt-g.ll
+++ b/llvm/test/DebugInfo/X86/split-dwarf-cross-cu-gmlt-g.ll
@@ -8,10 +8,12 @@
 ; CHECK-NEXT:   DW_AT_decl_file (0x02)
 ; CHECK-NEXT:   DW_AT_decl_line (4)
 
-; Function Attrs: noinline nounwind optnone uwtable mustprogress
-define dso_local void @_Z2f1v() local_unnamed_addr #0 !dbg !12 {
+; Function Attrs: norecurse nounwind uwtable mustprogress
+define dso_local i32 @main() local_unnamed_addr #2 !dbg !26 {
 entry:
-  ret void, !dbg !15
+  tail call void @_Z2f1v() #3, !dbg !28
+  tail call void @_Z2f1v() #3, !dbg !30
+  ret i32 0, !dbg !32
 }
 
 ; Function Attrs: nounwind uwtable mustprogress
@@ -21,6 +23,12 @@ entry:
   ret void, !dbg !22
 }
 
+; Function Attrs: noinline nounwind optnone uwtable mustprogress
+define dso_local void @_Z2f1v() local_unnamed_addr #0 !dbg !12 {
+entry:
+  ret void, !dbg !15
+}
+
 ; Function Attrs: nounwind uwtable mustprogress
 define dso_local void @_Z2f2v() local_unnamed_addr #1 !dbg !23 {
 entry:
@@ -28,14 +36,6 @@ entry:
   ret void, !dbg !25
 }
 
-; Function Attrs: norecurse nounwind uwtable mustprogress
-define dso_local i32 @main() local_unnamed_addr #2 !dbg !26 {
-entry:
-  tail call void @_Z2f1v() #3, !dbg !28
-  tail call void @_Z2f1v() #3, !dbg !30
-  ret i32 0, !dbg !32
-}
-
 attributes #0 = { noinline nounwind optnone uwtable mustprogress "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
 attributes #1 = { nounwind uwtable mustprogress "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
 attributes #2 = { norecurse nounwind uwtable mustprogress "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

diff  --git a/llvm/test/DebugInfo/X86/split-dwarf-cross-unit-reference.ll b/llvm/test/DebugInfo/X86/split-dwarf-cross-unit-reference.ll
index 3eddb9273226f..8d2af08aea2d6 100644
--- a/llvm/test/DebugInfo/X86/split-dwarf-cross-unit-reference.ll
+++ b/llvm/test/DebugInfo/X86/split-dwarf-cross-unit-reference.ll
@@ -1,12 +1,12 @@
 ; RUN: llc -mtriple=x86_64-linux -split-dwarf-cross-cu-references -split-dwarf-file=foo.dwo -filetype=obj -o %t < %s
-; RUN: llvm-objdump -r %t | FileCheck %s
-; RUN: llvm-dwarfdump -v -debug-info %t | FileCheck --check-prefix=ALL --check-prefix=INFO --check-prefix=DWO --check-prefix=CROSS %s
-; RUN: llvm-dwarfdump -v -debug-info %t | FileCheck --check-prefix=ALL --check-prefix=INFO %s
+; RUN: llvm-objdump -r %t | FileCheck --check-prefix=CHECK --check-prefix=RELO_CROSS %s
+; RUN: llvm-dwarfdump -v -debug-info %t | FileCheck --check-prefix=ALL --check-prefix=DWO --check-prefix=CROSS %s
+; RUN: llvm-dwarfdump -v -debug-info %t | FileCheck --check-prefix=ALL %s
 
 ; RUN: llc -mtriple=x86_64-linux -split-dwarf-file=foo.dwo -filetype=obj -o %t < %s
-; RUN: llvm-objdump -r %t | FileCheck %s
+; RUN: llvm-objdump -r %t | FileCheck --check-prefix=CHECK %s
 ; RUN: llvm-dwarfdump -v -debug-info %t | FileCheck --check-prefix=ALL --check-prefix=DWO --check-prefix=NOCROSS %s
-; RUN: llvm-dwarfdump -v -debug-info %t | FileCheck --check-prefix=ALL --check-prefix=INFO %s
+; RUN: llvm-dwarfdump -v -debug-info %t | FileCheck --check-prefix=ALL %s
 
 ; Testing cross-CU references for types, subprograms, and variables
 ; Built from code something like this:
@@ -48,8 +48,8 @@
 ; CHECK-NOT: RELOCATION RECORDS
 ; Expect one relocation in debug_info, from the inlined f1 in foo to its
 ; abstract origin in bar
-; CHECK: R_X86_64_32 .debug_info
-; CHECK-NOT: RELOCATION RECORDS
+; RELO_CROSS: R_X86_64_32 .debug_info
+; Expect no relocations in debug_info when disabling multiple CUs in Split DWARF
 ; CHECK-NOT: .debug_info
 ; CHECK: RELOCATION RECORDS
 ; CHECK-NOT: .rel{{a?}}.debug_info.dwo
@@ -75,29 +75,22 @@
 ; DWO:       DW_TAG_formal_parameter
 ; DWO:         DW_AT_abstract_origin [DW_FORM_ref4] {{.*}}{0x[[F1T]]}
 
-; ALL: Compile Unit
-; ALL: DW_TAG_compile_unit
-; DWO:   DW_AT_name {{.*}} "bar.cpp"
-; NOCROSS: 0x[[BAR_F1:.*]]: DW_TAG_subprogram
-; NOCROSS: DW_AT_name {{.*}} "f1"
-; NOCROSS: 0x[[BAR_F1T:.*]]: DW_TAG_formal_parameter
-; NOCROSS:   DW_AT_name {{.*}} "t"
-; NOCROSS:   DW_AT_type [DW_FORM_ref4] {{.*}}{0x[[BAR_T1:.*]]}
-; NOCROSS: NULL
-; NOCROSS: 0x[[BAR_T1]]: DW_TAG_structure_type
-; NOCROSS: DW_AT_name {{.*}} "t1"
+; NOCROSS-NOT: DW_TAG_compile_unit
+; CROSS: Compile Unit
+; CROSS: DW_TAG_compile_unit
+; CROSS:   DW_AT_name {{.*}} "bar.cpp"
 ; ALL:   DW_TAG_subprogram
 ; ALL:     DW_AT_name {{.*}} "bar"
 ; DWO:     DW_TAG_formal_parameter
 ; DWO:       DW_AT_name {{.*}} "t"
 ; CROSS:     DW_AT_type [DW_FORM_ref_addr] (0x00000000[[T1]]
-; NOCROSS:   DW_AT_type [DW_FORM_ref4] {{.*}}{0x[[BAR_T1]]}
+; NOCROSS:   DW_AT_type [DW_FORM_ref4] {{.*}}{0x[[T1]]}
 ; ALL:     DW_TAG_inlined_subroutine
-; INFO:     DW_AT_abstract_origin [DW_FORM_ref_addr] (0x00000000[[F1]]
-; NOCROSS:   DW_AT_abstract_origin [DW_FORM_ref4] {{.*}}{0x[[BAR_F1]]}
+; CROSS:     DW_AT_abstract_origin [DW_FORM_ref_addr] (0x00000000[[F1]]
+; NOCROSS:   DW_AT_abstract_origin [DW_FORM_ref4] {{.*}}{0x[[F1]]}
 ; DWO:       DW_TAG_formal_parameter
 ; CROSS:       DW_AT_abstract_origin [DW_FORM_ref_addr] (0x00000000[[F1T]]
-; NOCROSS:     DW_AT_abstract_origin [DW_FORM_ref4] {{.*}}{0x[[BAR_F1T]]
+; NOCROSS:     DW_AT_abstract_origin [DW_FORM_ref4] {{.*}}{0x[[F1T]]
 
 %struct.t1 = type { i32 }
 

diff  --git a/llvm/test/DebugInfo/X86/string-offsets-table-order.ll b/llvm/test/DebugInfo/X86/string-offsets-table-order.ll
index ca159eea615f6..6a61c458dec3e 100644
--- a/llvm/test/DebugInfo/X86/string-offsets-table-order.ll
+++ b/llvm/test/DebugInfo/X86/string-offsets-table-order.ll
@@ -11,13 +11,10 @@
 ; in 
diff erent order.
 
 ; CHECK: .debug_info contents:
-; CHECK:   DW_TAG_skeleton_unit
-; CHECK:     DW_AT_comp_dir [DW_FORM_strx1] (indexed (00000000) string = "X3")
-; CHECK:   DW_TAG_skeleton_unit
-; CHECK:     DW_AT_comp_dir [DW_FORM_strx1] (indexed (00000001) string = "X2")
-; CHECK:   DW_TAG_skeleton_unit
-; CHECK:     DW_AT_comp_dir [DW_FORM_strx1] (indexed (00000002) string = "X1")
-; CHECK: .debug_info.dwo contents:
+; CHECK:   DW_TAG_compile_unit
+; CHECK:     DW_AT_name [DW_FORM_strx1] (indexed (00000000) string = "X1")
+; CHECK:     DW_AT_name [DW_FORM_strx1] (indexed (00000002) string = "X2")
+; CHECK:     DW_AT_name [DW_FORM_strx1] (indexed (00000003) string = "X3")
 
 ; CHECK: .debug_str contents:
 ; CHECK: 0x[[X3:[0-9a-f]*]]: "X3"
@@ -26,11 +23,9 @@
 
 ; CHECK: .debug_str_offsets contents:
 ; CHECK: Format = DWARF32, Version = 5
-; CHECK-NEXT: [[X3]] "X3"
-; CHECK-NEXT: [[X2]] "X2"
-; CHECK-NEXT: [[X1]] "X1"
-; CHECK-NEXT: "foo.dwo"
-; CHECK-EMPTY:
+; CHECK: [[X3]] "X3"
+; CHECK: [[X1]] "X1"
+; CHECK: [[X2]] "X2"
 
 
 


        


More information about the llvm-commits mailing list